Hadoop enables us to implement and specify custom InputFormat implementations for our MapReduce computations. We can implement custom InputFormat implementations to gain more control over the input data as well as to support proprietary or application-specific input data file formats as inputs to Hadoop MapReduce computations. An InputFormat implementation should extend the
org.apache.hadoop.mapreduce.InputFormat<K,V> abstract class overriding the
In this recipe, we implement an InputFormat and a RecordReader for the HTTP log files. This InputFormat will generate
LongWritable instances as keys and
LogWritable instances as ...