July 2017
Intermediate to advanced
796 pages
18h 55m
English
Memory usages by your Spark application and underlying computing nodes can be categorized as execution and storage. Execution memory is used during the computation in merge, shuffles, joins, sorts, and aggregations. On the other hand, storage memory is used for caching and propagating internal data across the cluster. In short, this is due to the large amount of I/O across the network.
Read now
Unlock full access