O'Reilly logo

Programming MapReduce with Scalding by Antonios Chalkiopoulos

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Introducing Cascading

Cascading is an abstraction that empowers us to write efficient MapReduce applications. The API provides a framework for developers who want to think in higher levels and follow Behavior Driven Development (BDD) and Test Driven Development (TDD) to provide more value and quality to the business.

Cascading is a mature library that was released as an open source project in early 2008. It is a paradigm shift and introduces new notions that are easier to understand and work with.

In Cascading, we define reusable pipes where operations on data are performed. Pipes connect with other pipes to create a pipeline. At each end of a pipeline, a tap is used. Two types of taps exist: source, where input data comes from and sink, where the ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required