Hadoop in Practice, Second Edition

Chapter 6. Applying MapReduce patterns to big data

This chapter covers

Learning how to join data with map-side and reduce-side joins
Understanding how a secondary sort works
Discovering how partitioning works and how to globally sort data

With your data safely in HDFS, it’s time to learn how to work with that data in MapReduce. Previous chapters showed you some MapReduce snippets in action when working with data serialization. In this chapter we’ll look at how to work effectively with big data in MapReduce to solve common problems.

MapReduce basics

If you want to understand the mechanics of MapReduce and how to write basic MapReduce programs, it’s worth your time to read Hadoop in Action by Chuck Lam (Manning, 2010).


Hadoop in Practice, Second Edition by Alex Holmes

