
MongoDB Cookbook - Second Edition by Amol Nayak, Cyrus Dasadia


Running MapReduce jobs on Hadoop using streaming

In the previous recipe, we implemented a simple MapReduce job using Hadoop's Java API. The use case was the same as in the recipes in Chapter 3, Programming Language Drivers, where we implemented MapReduce using the Mongo client APIs in Python and Java. In this recipe, we will use Hadoop streaming to implement MapReduce jobs.

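Streaming works by communicating over stdin and stdout: Hadoop runs the mapper and the reducer as external processes, feeds them their input on stdin, and collects their output from stdout. You can find more information on Hadoop streaming and how it works at http://hadoop.apache.org/docs/r1.2.1/streaming.html.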
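To make the stdin/stdout contract concrete, here is a minimal sketch of a streaming-style mapper and reducer in plain Python. It is not the code this recipe goes on to use. It assumes generic text-based streaming, where each record is a line and key and value are separated by a tab, and it assumes a simple count-per-key job for illustration. The function names `map_lines` and `reduce_lines` and the script name `streaming_sketch.py` are hypothetical.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Emit one 'key<TAB>1' record per non-empty input line.

    Assumption: each input line holds exactly one key to be counted.
    """
    for line in lines:
        key = line.strip()
        if key:
            yield f"{key}\t1"


def reduce_lines(lines):
    """Sum the counts for each key.

    Assumption: records arrive sorted by key, which is how a streaming
    reducer normally receives the mapper output.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines if line.strip())
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{key}\t{total}"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical dispatch: the first argument selects the phase to run.
    step = {"map": map_lines, "reduce": reduce_lines}.get(sys.argv[1])
    if step is not None:
        for record in step(sys.stdin):
            print(record)
```

Because both phases only read stdin and write stdout, the job can be approximated locally with a shell pipeline such as `cat input.txt | python streaming_sketch.py map | sort | python streaming_sketch.py reduce`, where `sort` stands in for the shuffle step between the two phases.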

Getting ready…

Refer to the Executing our first sample MapReduce job using the mongo-hadoop connector recipe in this chapter to see how to set up Hadoop for development ...
