
GDPR, the forgotten done right
In the previous article, we covered anonymization and pseudonymization – techniques used in the context of ensuring data privacy, and more specifically, in the…
How does storage organisation affect query performance?
Despite great efforts to separate the interface from the implementation (as SQL does), the pesky details always turn out to matter when deploying to production, either when performance…
Storage organisation vs query performance – examples
The article "How does storage organisation affect query performance?" described a number of principles for modelling data in Amazon…
Obtaining value from GDPR with solutions that work for your bottom line.
Compliance with the GDPR can be profitable when done right. Apart from saving on legal fees and avoiding customer attrition, you can also…
How to waste money in the cloud
Expense optimization is often the main reason for migrating from on-premises to the cloud. The combination of pay-as-you-go and flexible provisioning reduces the problem of…
Spark shuffle – Case #3 – using salt in repartition
Why use salt in repartition? In the previous blog entry we saw how skew in a processed dataset affects the performance of Spark…