Recipe 5-1. Aggregate data on a single key
Recipe 5-2. Aggregate data on multiple keys
Recipe 5-3. Create a contingency table
Recipe 5-4. Perform joining operations on two DataFrames
Recipe 5-5. Vertically stack two DataFrames
Recipe 5-6. Horizontally stack two DataFrames
Recipe 5-7. Perform ...
© Raju Kumar Mishra and Sundar Rajan Raman 2019
Raju Kumar Mishra and Sundar Rajan Raman, PySpark SQL Recipes, https://doi.org/10.1007/978-1-4842-4335-0_5
5. Data Merging and Data Aggregation Using PySparkSQL
(1) Bangalore, Karnataka, India
(2) Chennai, Tamil Nadu, India
Data merging and data aggregation are essential parts of the day-to-day work of PySparkSQL users. This chapter covers the following recipes.