May 2024
Beginner to intermediate
438 pages
9h 41m
English
Apache Spark is a distributed computing framework built for large-scale data processing. One of the most common tasks when working with data is loading it from various sources and writing it out in various formats. In this hands-on chapter, you will learn how to transform and manipulate data using Apache Spark.
In this chapter, we’re going to cover the following main recipes: