January 2020
Intermediate to advanced
312 pages
10h 22m
English
This chapter covers
In chapters 7–10, we saw the power of the distributed frameworks Hadoop and Spark. These frameworks can take advantage of clusters of computers to parallelize massive data processing tasks and complete them in short order. Most of us, however, don’t have access to physical compute clusters.
In contrast, we can all get access to compute clusters from cloud service providers such as Amazon, Microsoft, and Google. These cloud providers have platforms that we can use for ...