This chapter introduces the basic operations of PySpark. We encourage you to set up a PySpark environment and try these operations on a dataset of your choice to reinforce your understanding. Because Spark is a very large topic, we cover just enough of the PySpark basics and concepts to get you started before moving on to data-wrangling activities. This chapter demonstrates the most common data operations in PySpark that you may encounter ...
2. PySpark Basics