File input/output (I/O) operations are an integral part of many software activities.
A data scientist deals with many types of files, including text files, comma-separated values (CSV) files, JavaScript Object Notation (JSON) files, and many more. The Hadoop Distributed File System (HDFS) is a distributed file system that stores large files as replicated blocks across the machines of a cluster, which makes it well suited to large data sets.
Recipe 6-1. Read a simple text file
Recipe 6-2. Write an RDD to a simple text file
Recipe 6-3. Read a directory
Recipe 6-4. Read data from HDFS
Recipe 6-5. Save an RDD to HDFS
Recipe 6-6. Read data from a sequential ...