7.1 Common behaviors of parsers
7.2 Complex ingestion from CSV
7.2.1 Desired output
7.2.2 Code
7.3 Ingesting a CSV with a known schema
7.3.1 Desired output
7.3.2 Code
7.4 Ingesting a JSON file
7.4.1 Desired output
7.4.2 Code
7.5 Ingesting a multiline JSON file
7.5.1 Desired output
7.5.2 Code
7.6 Ingesting an XML file
7.6.1 Desired output
7.6.2 Code
7.7 Ingesting a text file
7.7.1 Desired output
7.7.2 Code
7.8 File formats for big data
7.8.1 The problem with traditional file formats
7.8.2 Avro is a schema-based serialization format
7.8.3 ORC is a columnar storage format
7.8.4 Parquet is also a columnar storage format
7.8.5 Comparing Avro, ORC, and Parquet
7.9 Ingesting Avro, ORC, and Parquet files
7.9.1 Ingesting Avro
7.9.2 Ingesting ORC
7.9.3 Ingesting Parquet
7.9.4 Reference table for ingesting Avro, ORC, or Parquet
Summary