4
Reading CSV and JSON Files and Solving Problems
When working with data, we come across several different types, such as structured, semi-structured, and unstructured data, along with formats specific to the outputs of other systems. Two file types are especially common in ingestion: comma-separated values (CSV) and JavaScript Object Notation (JSON). Both have many applications and are widely used for data ingestion because of their versatility.
In this chapter, you will learn more about these file formats, how to ingest them using Python and PySpark, how to apply best practices, and how to solve ingestion- and transformation-related problems.
We will cover the following recipes:
- Reading a CSV file
- Reading a JSON file
- Creating a ...
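Before the recipes themselves, here is a minimal sketch of what reading each format can look like using only Python's standard library. It is not the book's own recipe code: the sample data, the variable names, and the helper functions `read_csv_rows` and `read_json_records` are hypothetical and exist only for illustration. PySpark is left out so that the sketch runs without extra dependencies.

```python
import csv
import io
import json

# Hypothetical sample data, held in memory so the sketch runs without any files.
CSV_TEXT = "id,name,city\n1,Ana,Lisbon\n2,Ben,Oslo\n"
JSON_TEXT = '[{"id": 1, "name": "Ana"}, {"id": 2, "name": "Ben"}]'


def read_csv_rows(text_stream):
    """Return each CSV row as a dict keyed by the header line."""
    return list(csv.DictReader(text_stream))


def read_json_records(text_stream):
    """Parse a single JSON document from a text stream."""
    return json.load(text_stream)


csv_rows = read_csv_rows(io.StringIO(CSV_TEXT))
json_records = read_json_records(io.StringIO(JSON_TEXT))

# CSV values always arrive as strings; JSON keeps numbers as numbers.
print(csv_rows[0])
print(json_records[1])
```

To read real files instead, the same helpers should accept a file object, for example one returned by `open(path, newline="", encoding="utf-8")` for CSV. Note the difference in typing: the CSV reader yields every value as a string, while the JSON parser preserves numbers, booleans, and nesting.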