Hadoop streaming

We mentioned previously that MapReduce programs don't have to be written in Java. There are several reasons why you might want or need to write your map and reduce tasks in another language: perhaps you have existing code to leverage, or you need to use third-party binaries.

Hadoop provides a number of mechanisms to aid non-Java development. The primary ones are Hadoop pipes, which provides a native C++ interface, and Hadoop streaming, which allows any program that reads standard input and writes standard output to be used for map and reduce tasks. With the MapReduce Java API, both map and reduce tasks provide implementations for methods that contain the task functionality. These methods receive the input to the ...
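To make the standard input and output contract concrete, here is a minimal word-count sketch in Python. It is an illustration rather than code from the book: the function names (`map_lines`, `reduce_lines`, `run`) and the convention of choosing the role with a command-line argument are assumptions made for this example. It relies on the usual streaming conventions that records are tab-separated key/value lines and that the reducer receives its input sorted by key.

```python
#!/usr/bin/env python3
"""Word-count sketch for Hadoop streaming: one script, used as mapper or reducer."""
import sys
from itertools import groupby


def map_lines(lines):
    """Emit one 'word<TAB>1' record for every word in the input lines."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(lines):
    """Sum the counts for each word; assumes records arrive sorted by key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


def run(role, stdin=sys.stdin, stdout=sys.stdout):
    """Read records from stdin, apply the chosen step, write results to stdout."""
    step = map_lines if role == "map" else reduce_lines
    for record in step(stdin):
        stdout.write(record + "\n")


# Hypothetical convention: the role ("map" or "reduce") is the only argument.
if __name__ == "__main__" and len(sys.argv) == 2:
    run(sys.argv[1])
```

Because the script only touches standard input and output, it can be checked locally without a cluster, for example with `cat input.txt | python3 wordcount.py map | sort | python3 wordcount.py reduce` (the file names here are placeholders). When submitting a streaming job, the same two commands would be supplied through the `-mapper` and `-reducer` options, alongside `-input` and `-output`.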
