The technicalities
10 min read
Spark shuffle – Case #2 – repartitioning skewed data
In the previous blog entry we reviewed a Spark scenario where calling the partitionBy method resulted in each task creating as many files as you had days…