Commonly Supported File Formats
We've already seen how easily you can manipulate text files in Spark using the textFile() method on SparkContext. However, you'll be pleased to know that Apache Spark supports a number of other formats, and the list grows with each release. As of Apache Spark 2.0, the following file formats are supported out of the box:
- Text files (already covered)
- JSON files
- CSV files
- Sequence files
- Object files
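As a rough illustration of the list above, the following PySpark sketch reads each format once. It assumes a local Spark 2.x installation with the pyspark package available. The helper names (write_sample_files, read_with_spark), the file names, and the sample records are hypothetical and exist only for this example. Note that in the Python API the closest counterpart to object files is a pickle file (saveAsPickleFile() and pickleFile()), while the Scala and Java APIs expose objectFile() on SparkContext.

```python
import csv
import json
import os
import tempfile


def write_sample_files(base_dir):
    """Write tiny text, JSON and CSV inputs and return all paths used below.

    The file names and records are made up for this sketch.
    """
    paths = {
        "text": os.path.join(base_dir, "sample.txt"),
        "json": os.path.join(base_dir, "people.json"),
        "csv": os.path.join(base_dir, "people.csv"),
        # Spark creates these two directories itself; they must not exist yet.
        "seq": os.path.join(base_dir, "pairs_seq"),
        "obj": os.path.join(base_dir, "numbers_pickle"),
    }
    people = [{"name": "Ann", "age": 34}, {"name": "Bob", "age": 27}]

    with open(paths["text"], "w") as f:
        f.write("hello spark\nhello formats\n")

    # spark.read.json expects one JSON object per line by default.
    with open(paths["json"], "w") as f:
        for person in people:
            f.write(json.dumps(person) + "\n")

    with open(paths["csv"], "w", newline="") as f:
        writer = csv.writer(f, lineterminator="\n")
        writer.writerow(["name", "age"])
        for person in people:
            writer.writerow([person["name"], person["age"]])

    return paths


def read_with_spark(paths):
    """Read each format once with a local SparkSession (needs pyspark)."""
    # Imported here so the helper above can run without Spark installed.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .master("local[1]")
             .appName("file-formats-sketch")
             .getOrCreate())
    sc = spark.sparkContext
    try:
        # Text files: an RDD with one string per line.
        lines = sc.textFile(paths["text"]).collect()

        # JSON and CSV files: DataFrames via the DataFrameReader.
        json_rows = spark.read.json(paths["json"]).collect()
        csv_rows = spark.read.csv(paths["csv"], header=True,
                                  inferSchema=True).collect()

        # Sequence files: key/value pairs stored as Hadoop Writables.
        sc.parallelize([("a", 1), ("b", 2)]).saveAsSequenceFile(paths["seq"])
        seq_pairs = sc.sequenceFile(paths["seq"]).collect()

        # Object files: from Python, serialized objects go through pickle files.
        sc.parallelize([1, 2, 3]).saveAsPickleFile(paths["obj"])
        objects = sc.pickleFile(paths["obj"]).collect()

        return lines, json_rows, csv_rows, seq_pairs, objects
    finally:
        spark.stop()


if __name__ == "__main__":
    sample_paths = write_sample_files(tempfile.mkdtemp())
    for result in read_with_spark(sample_paths):
        print(result)
```

Text, sequence, and pickle files are read through SparkContext and come back as RDDs, whereas JSON and CSV go through spark.read and come back as DataFrames. That split is a choice made for this sketch rather than the only way to load these formats.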