March 2022
Beginner to intermediate
456 pages
13h
English
With two different kinds of programs under your belt, it’s time to expand your horizons. Part 2 is about diversifying your set of tools so that no data set will hold any secrets from you.
Chapter 6 breaks the rows-and-columns mold to go multidimensional. Using JSON data, we build data frames that themselves contain data frames. This tool greatly extends the versatility of the Spark data frame.
Chapter 7 brings PySpark and SQL together. Combined, they unlock a new level of expressiveness and succinctness in your code, let you scale SQL workflows at record speed, and provide a new way to reason about your analyses.
Chapters 8 and 9 cover going full Python with your ...