
Tags: partitionby

Tags: partitionby, repartition, shuffle, spark

Spark shuffle – Case #1 – partitionBy and repartition

10 June 2018 / 6 October 2018 by Marcin

This is the first in a series of articles explaining how the shuffle operation works in Spark and how to use that knowledge in your daily work as a data engineer or data scientist. It will be a case-by-case explanation, so I will start by showing you a code example which does […]

©2017 Tantus Data.
All rights reserved.