Why use salt in repartition? In the previous blog entry we saw how skew in a processed dataset affects the performance of Spark…
In the previous blog entry we reviewed a Spark scenario where calling the partitionBy method resulted in each task creating as many files as you had days…
This is the first in a series of articles explaining how the shuffle operation works in Spark and how to use…