March 2019
Beginner to intermediate
182 pages
4h 6m
English
In this chapter, we learned how to calculate averages with map and reduce, and how aggregate lets us compute the same averages faster. Finally, we learned that pivot tables allow us to aggregate data based on different values of features, and that, with pivot tables in PySpark, we can leverage handy functions such as reduceByKey or countByKey.
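As a quick recap of those patterns, here is a minimal sketch in plain Python rather than PySpark, so it runs without a Spark cluster. The sample data, the two-way partition split, and the helper names (add_pairs, seq_op, comb_op, reduce_by_key) are assumptions made for illustration; the comments name the RDD methods that each step is meant to mirror.

```python
from collections import Counter
from functools import reduce

# Hypothetical sample data, not taken from the chapter.
values = [4.0, 8.0, 15.0, 16.0, 23.0, 42.0]


def add_pairs(a, b):
    """Add two (sum, count) pairs element-wise."""
    return (a[0] + b[0], a[1] + b[1])


# map + reduce: turn every value into a (value, 1) pair, then add the pairs.
# Rough PySpark analogue: rdd.map(lambda x: (x, 1)).reduce(add_pairs)
total, count = reduce(add_pairs, map(lambda x: (x, 1), values))
mean_map_reduce = total / count


def seq_op(acc, x):
    """Fold one value into a partition-local (sum, count) accumulator."""
    return (acc[0] + x, acc[1] + 1)


def comb_op(a, b):
    """Merge the accumulators of two partitions."""
    return (a[0] + b[0], a[1] + b[1])


# aggregate: fold each partition from a zero value, then merge the partials.
# Rough PySpark analogue: rdd.aggregate((0.0, 0), seq_op, comb_op)
partitions = [values[:3], values[3:]]  # assumed split into two partitions
partials = [reduce(seq_op, part, (0.0, 0)) for part in partitions]
agg_total, agg_count = reduce(comb_op, partials, (0.0, 0))
mean_aggregate = agg_total / agg_count

# Hypothetical key-value records for the by-key operations.
pairs = [("a", 1), ("b", 2), ("a", 3), ("b", 4), ("a", 5)]


def reduce_by_key(func, records):
    """Combine the values of each key with func, like RDD.reduceByKey."""
    result = {}
    for key, value in records:
        result[key] = func(result[key], value) if key in result else value
    return result


# Rough PySpark analogues: rdd.reduceByKey(lambda x, y: x + y), rdd.countByKey()
sums_by_key = reduce_by_key(lambda x, y: x + y, pairs)
counts_by_key = Counter(key for key, _ in pairs)

print(mean_map_reduce, mean_aggregate)
print(sums_by_key, dict(counts_by_key))
```

The aggregate version skips the intermediate (value, 1) pairs and folds each value straight into an accumulator, which is one plausible reason it does less work than the map and reduce version.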
In the next chapter, we will learn about MLlib, Spark's machine learning library.