Chapter 4. Applying MapReduce patterns to big data


This chapter covers
  • Learning how to join data with map-side and reduce-side joins
  • Understanding how a secondary sort works
  • Discovering how partitioning works and how to globally sort data


With your data safely in HDFS, it’s time to learn how to work with that data in MapReduce. Previous chapters showed you some MapReduce snippets in action when working with data serialization. In this chapter we’ll look at how to work effectively with big data in MapReduce to solve common problems.


MapReduce Basics

If you want to understand the mechanics of MapReduce and how to write basic MapReduce ...
