Programming MapReduce with Scalding by Antonios Chalkiopoulos

Chapter 3. Scalding by Example

This chapter shows how to read and write local and Hadoop Distributed File System (HDFS) files with Scalding. It introduces Scalding's core capabilities through the Fields API and serves as a reference for how the Scalding commands are used. In this chapter, we will cover:

  • Map-like operations
  • Join operations
  • Pipe operations
  • Grouping and reducing operations
  • Composite operations

Reading and writing files

Data mostly lives in files on the filesystem: semi-structured text files, structured delimited files, or more sophisticated formats such as Avro and Parquet. Logfiles, SQL exports, JSON, XML, and any other type of file can be processed with Scalding.

Scalding is capable of reading and writing many ...
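
As a minimal sketch of reading and writing files with the Fields API, the following job reads a plain text file, splits each line into words, counts them, and writes the result as a tab-separated file. The job name `WordCountJob`, the helper `Tokenizer`, and the argument names `input` and `output` are assumptions made for this illustration and do not come from the chapter. The example assumes the Scalding library is on the classpath.

```scala
import com.twitter.scalding._

// Hypothetical helper (not part of Scalding): splits a line into lowercase words.
object Tokenizer {
  def tokenize(line: String): Seq[String] =
    line.toLowerCase.split("\\s+").filter(_.nonEmpty).toSeq
}

// Hypothetical job; "input" and "output" are assumed command-line argument names.
class WordCountJob(args: Args) extends Job(args) {
  TextLine(args("input"))          // text source; each row exposes its text in the 'line field
    .flatMap('line -> 'word) { line: String => Tokenizer.tokenize(line) } // map-like operation
    .groupBy('word) { _.size }     // grouping operation; adds a 'size field per word
    .write(Tsv(args("output")))    // tab-separated sink
}
```

The sketch uses only `TextLine` and `Tsv` as source and sink. The paths passed as `input` and `output` are resolved by the mode the job is launched in, so the same job definition should work against either local or HDFS files.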