Learning Cascading by Victoria Loewengart, Michael Covert

Understanding how Cascading represents records

Now that you have had a glimpse of how to implement a data processing system using Cascading, let's dive into its internals. In this section, we will learn how to define and structure data streams for Cascading processing.

Using tuples and defining fields

The idea of a tuple is very similar to that of a record in a database.

A tuple (cascading.tuple.Tuple) stores a vector, or collection, of values addressed by offset; each position can be associated with a specific object type and name. A tuple can hold data objects of different types. A series of tuples makes a tuple stream. A simple example of a tuple is [String name, Integer age]. A tuple has a large set of methods to get, set, append, ...
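To make the idea of values addressed by offset, with names layered on top, more concrete, here is a minimal plain-Java sketch. It does not use Cascading: SimpleTuple and SimpleFields are hypothetical stand-ins written only for this illustration, and their method names are assumptions chosen to echo the get, set, and append operations mentioned above, not the actual cascading.tuple.Tuple API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration only; NOT cascading.tuple.Tuple.
// Holds values of mixed types, addressed by zero-based offset.
final class SimpleTuple {
    private final List<Object> values = new ArrayList<>();

    SimpleTuple(Object... initial) {
        values.addAll(Arrays.asList(initial));
    }

    Object get(int pos) { return values.get(pos); }

    void set(int pos, Object value) { values.set(pos, value); }

    void append(Object value) { values.add(value); }

    int size() { return values.size(); }

    // Typed accessors: the caller states which type it expects at an offset.
    String getString(int pos) { return (String) values.get(pos); }

    int getInteger(int pos) { return ((Number) values.get(pos)).intValue(); }
}

// Hypothetical stand-in that maps field names to tuple offsets.
final class SimpleFields {
    private final List<String> names;

    SimpleFields(String... names) {
        this.names = Arrays.asList(names);
    }

    int positionOf(String name) {
        int pos = names.indexOf(name);
        if (pos < 0) {
            throw new IllegalArgumentException("Unknown field: " + name);
        }
        return pos;
    }
}

class TupleSketch {
    public static void main(String[] args) {
        // The [String name, Integer age] example from the text.
        SimpleFields fields = new SimpleFields("name", "age");
        SimpleTuple person = new SimpleTuple("Alice", 30);

        // Access by name resolves to an offset first, then reads the value.
        String name = person.getString(fields.positionOf("name"));
        int age = person.getInteger(fields.positionOf("age"));
        System.out.println(name + " is " + age);

        // Tuples are mutable: set a value in place or append a new one.
        person.set(fields.positionOf("age"), age + 1);
        person.append("Columbus");
        System.out.println("size = " + person.size());
    }
}
```

Keeping the names in a separate object from the values mirrors the description above, where a tuple is addressed by offset and names are an association on top of those offsets rather than part of the stored data.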