Java API to MapReduce
The Java API to MapReduce is exposed by the org.apache.hadoop.mapreduce package. At its core, writing a MapReduce program is a matter of subclassing the Hadoop-provided Mapper and Reducer base classes and overriding the map() and reduce() methods with our own implementations.
The Mapper class
For our own Mapper implementations, we subclass the Mapper base class and override the map() method, as follows:
class Mapper<K1, V1, K2, V2> {
    void map(K1 key, V1 value, Mapper.Context context)
        throws IOException, InterruptedException
    ...
}
The class is defined in terms of its input and output key/value types, and the map() method takes an input key/value pair as its parameters. The other parameter is an instance of the Context class ...
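To make the shape of a Mapper subclass concrete, here is a minimal sketch that runs without Hadoop on the classpath. The Mapper and Context classes below are hypothetical stand-ins written for this illustration, not the real org.apache.hadoop.mapreduce classes, and WordCountMapper is likewise an invented name. A real job would extend Hadoop's own Mapper and would typically use Writable types such as LongWritable, Text, and IntWritable in place of Long, String, and Integer.

```java
import java.io.IOException;
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class MapperSketch {

    // Hypothetical stand-in for Hadoop's Context: it only collects emitted pairs.
    static class Context<K2, V2> {
        private final List<Map.Entry<K2, V2>> output = new ArrayList<>();

        void write(K2 key, V2 value) throws IOException, InterruptedException {
            output.add(new SimpleEntry<>(key, value));
        }

        List<Map.Entry<K2, V2>> getOutput() {
            return output;
        }
    }

    // Hypothetical stand-in for the Mapper base class, typed by the
    // input (K1, V1) and output (K2, V2) key/value types.
    static class Mapper<K1, V1, K2, V2> {
        void map(K1 key, V1 value, Context<K2, V2> context)
                throws IOException, InterruptedException {
            // The base implementation in this sketch emits nothing.
        }
    }

    // Our own implementation: subclass the base class and override map().
    static class WordCountMapper extends Mapper<Long, String, String, Integer> {
        @Override
        void map(Long key, String value, Context<String, Integer> context)
                throws IOException, InterruptedException {
            // Emit a (word, 1) pair for every whitespace-separated token in the line.
            for (String word : value.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    context.write(word, 1);
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Context<String, Integer> context = new Context<>();
        // The key stands in for the line's offset; the value is the line of text.
        new WordCountMapper().map(0L, "to be or not to be", context);
        for (Map.Entry<String, Integer> pair : context.getOutput()) {
            System.out.println(pair.getKey() + "\t" + pair.getValue());
        }
    }
}
```

The point of the sketch is the division of labour: the subclass fixes the four type parameters and supplies the per-record logic, while everything it emits goes through the context object rather than a return value.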