September 2014
Intermediate to advanced
512 pages
13h 54m
English
Chapter 3. Data serialization—working with text and beyond
Listing 3.1. Extracting content with Java’s StAX parser
Listing 3.2. A reducer to emit start and end tags
Listing 3.3. A Writable implementation to represent a stock price
Listing 3.4. A Pig loader function that converts a StockPriceWritable into a Pig tuple
Listing 3.5. Writing Avro files from outside of MapReduce
Listing 3.6. A RecordWriter that produces MapReduce output in CSV form
Chapter 4. Organizing and optimizing data in HDFS
Listing 4.1. Read a directory containing small files and produce a single Avro file in HDFS
Listing 4.2. A MapReduce job that takes as input Avro files containing the small files
Listing 4.3. Methods to read and write LZOP files in HDFS ...