July 2017
Intermediate to advanced
796 pages
18h 55m
English
SequenceFileRDD is created from a SequenceFile, a Hadoop file format that stores binary key-value pairs. A SequenceFile can be compressed or uncompressed.
The following example shows how to write a pair RDD out as a SequenceFile and read it back as a SequenceFileRDD:
scala> val pairRDD = statesPopulationRDD.map(record => (record.split(",")(0), record.split(",")(2)))
pairRDD: org.apache.spark.rdd.RDD[(String, String)] = MapPartitionsRDD[60] at map at <console>:27
scala> pairRDD.saveAsSequenceFile("seqfile")
scala> val seqRDD = sc.sequenceFile[String, String]("seqfile")
seqRDD: ...
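The same round trip can also be sketched as a standalone program rather than a shell session. This is a minimal sketch under stated assumptions, not the book's own listing: the object name `SequenceFileRoundTrip`, the two sample records, and the temporary output directory are hypothetical stand-ins for `statesPopulationRDD` and the `seqfile` path, and the code assumes `spark-core` is on the classpath and that Spark runs in local mode.

```scala
import java.nio.file.Files

import org.apache.spark.{SparkConf, SparkContext}

object SequenceFileRoundTrip {

  // Writes a small pair RDD as a SequenceFile, reads it back, and returns the pairs.
  def roundTrip(): Array[(String, String)] = {
    val conf = new SparkConf().setMaster("local[*]").setAppName("SequenceFileRoundTrip")
    val sc = new SparkContext(conf)
    try {
      // Hypothetical stand-in for statesPopulationRDD: "state,year,population" records.
      val statesPopulationRDD = sc.parallelize(Seq("StateA,2010,100", "StateB,2010,200"))

      // Split each record once and keep (state, population) as the key-value pair.
      val pairRDD = statesPopulationRDD.map { record =>
        val fields = record.split(",")
        (fields(0), fields(2))
      }

      // saveAsSequenceFile fails if the target already exists,
      // so write to a fresh child of a temporary directory.
      val out = Files.createTempDirectory("seqfile-demo").resolve("seqfile").toString
      pairRDD.saveAsSequenceFile(out)

      // Read the pairs back; the type parameters name the key and value types.
      sc.sequenceFile[String, String](out).collect()
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit =
    roundTrip().foreach { case (state, population) => println(s"$state -> $population") }
}
```

Splitting each record once, instead of calling `split` twice as in the shell session, avoids repeating the work per record. To write a compressed file, `saveAsSequenceFile` accepts an optional codec as a second argument, for example `Some(classOf[org.apache.hadoop.io.compress.DefaultCodec])`.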