© Raju Kumar Mishra and Sundar Rajan Raman 2019
Raju Kumar Mishra and Sundar Rajan Raman, PySpark SQL Recipes, https://doi.org/10.1007/978-1-4842-4335-0_3

3. IO in PySpark SQL

Raju Kumar Mishra (1) and Sundar Rajan Raman (2)
(1) Bangalore, Karnataka, India
(2) Chennai, Tamil Nadu, India
 
Reading data from different file formats and saving results to different data sinks is an inevitable part of the data scientist’s job. The recipes in this chapter show how to read data from different types of data sources and how to save the results of an analysis to different data sinks:
  • Recipe 3-1. Read a CSV file

  • Recipe 3-2. Read a JSON file

  • Recipe 3-3. Save a DataFrame as a CSV file

  • Recipe 3-4. Save a DataFrame ...
