In our previous recipe, we implemented a simple MapReduce job using the Java API of Hadoop. The use case was the same as in the recipes in Chapter 3, Programming Language Drivers, where we implemented MapReduce using the Mongo client APIs in Python and Java. In this recipe, we will use Hadoop streaming to implement MapReduce jobs.
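Streaming works by communicating over stdin and stdout: Hadoop passes input records to an external program on its standard input and reads the results from its standard output. You can get more information on Hadoop streaming and how it works at http://hadoop.apache.org/docs/r1.2.1/streaming.html.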
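To make the stdin/stdout contract concrete, here is a minimal sketch of a streaming-style mapper and reducer in Python. It assumes plain text records and tab-separated key/value lines, as in generic Hadoop streaming; it is not the code for this recipe and does not use the mongo-hadoop connector. The function names, the key-counting use case, and the map/reduce command-line argument are all hypothetical, chosen only for illustration.

```python
import sys
from itertools import groupby


def run_mapper(lines, out):
    """Emit one 'key<TAB>1' line for every whitespace-separated token."""
    for line in lines:
        for key in line.split():
            out.write(f"{key}\t1\n")


def run_reducer(lines, out):
    """Sum the counts per key.

    Assumes the input lines are already sorted by key, which is what
    Hadoop guarantees between the map and reduce phases.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines if line.strip())
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        out.write(f"{key}\t{total}\n")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical convention: the first argument selects the role.
    role = sys.argv[1]
    if role == "map":
        run_mapper(sys.stdin, sys.stdout)
    elif role == "reduce":
        run_reducer(sys.stdin, sys.stdout)
```

Because both phases only read standard input and write standard output, the flow can be simulated locally without Hadoop by piping a text file through the script in map mode, then through sort, then through the script in reduce mode. The sort step stands in for the shuffle that Hadoop performs between the two phases.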
Refer to the Executing our first sample MapReduce job using the mongo-hadoop connector recipe in this chapter to see how to set up Hadoop for development ...