Chapter 7. Batch layer: Illustration

This chapter covers

Sources of complexity in data-processing code
JCascalog as a practical implementation of pipe diagrams
Applying abstraction and composition techniques to data processing

In the last chapter you saw how pipe diagrams are a natural and concise way to specify computations that operate over large amounts of data. You saw that pipe diagrams can be executed as a series of MapReduce jobs for parallelism and scalability.
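As a reminder of what a pipe diagram expresses, here is a minimal sketch of a word count written in plain Java. It is not JCascalog code and does not run on MapReduce. The class name `PipeDiagramSketch` and the methods `split` and `countByWord` are hypothetical names chosen for illustration. It only mirrors the usual shape of a pipe diagram: a function node that emits tuples, followed by a group-by node and an aggregator node.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Illustrative only: an in-memory stand-in for a three-node pipe diagram.
public class PipeDiagramSketch {

    // Function node: emits one word tuple for every word in each sentence tuple.
    static List<String> split(List<String> sentences) {
        return sentences.stream()
                .flatMap(s -> Arrays.stream(s.trim().split("\\s+")))
                .filter(w -> !w.isEmpty())
                .collect(Collectors.toList());
    }

    // Group-by node followed by a count aggregator node.
    // A TreeMap keeps the keys sorted so the result prints in a stable order.
    static Map<String, Long> countByWord(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(w -> w, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> sentences = Arrays.asList("the quick brown fox", "the lazy dog");
        // prints {brown=1, dog=1, fox=1, lazy=1, quick=1, the=2}
        System.out.println(countByWord(split(sentences)));
    }
}
```

In a MapReduce execution of the same diagram, the function node would typically run in the map phase and the grouping and counting in the reduce phase. This sketch collapses both into a single in-memory pass.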

In this illustration chapter, we’ll look at a tool that maps fairly directly onto pipe diagrams: JCascalog. There’s a lot to cover in JCascalog, so this chapter is more involved than the previous illustration chapters. As always, you can still learn the full ...

Big Data by James Warren, Nathan Marz
