The top-k algorithm is a popular MapReduce algorithm. Each mapper emits the top-k records from its own portion of the input, and the reducer then filters out the overall top-k records from all the records it receives from the mappers. We will reuse the player score example from earlier. The objective is to find the top-k players with the lowest scores. Let's look at the mapper implementation. We assume that each player has a unique score; otherwise the logic needs a small change: we would have to keep a list of players' details as the value for each score and still emit only 10 records from the cleanup method.
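Before turning to the Hadoop classes, the two-stage idea can be illustrated with a minimal plain-Java sketch. It is not the book's implementation and uses no Hadoop API: the class name TopKSketch, the methods lowestK and merge, and the sample players are all hypothetical. It assumes unique scores, as stated above, so the score can serve as the key of a sorted map.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical plain-Java sketch of the top-k idea; no Hadoop classes are used.
class TopKSketch {

    // "Map" side: keep only the k entries with the lowest scores.
    // Assumes scores are unique, so a score can be used as the TreeMap key.
    static TreeMap<Integer, String> lowestK(Map<Integer, String> scores, int k) {
        TreeMap<Integer, String> best = new TreeMap<>();
        for (Map.Entry<Integer, String> e : scores.entrySet()) {
            best.put(e.getKey(), e.getValue());
            if (best.size() > k) {
                best.pollLastEntry(); // drop the entry with the highest score
            }
        }
        return best;
    }

    // "Reduce" side: merge the local results and apply the same filter again.
    static TreeMap<Integer, String> merge(List<TreeMap<Integer, String>> partials, int k) {
        TreeMap<Integer, String> all = new TreeMap<>();
        for (TreeMap<Integer, String> p : partials) {
            all.putAll(p);
        }
        return lowestK(all, k);
    }

    public static void main(String[] args) {
        // Two made-up input splits, one per simulated mapper.
        Map<Integer, String> split1 = Map.of(42, "alice", 17, "bob", 88, "carol");
        Map<Integer, String> split2 = Map.of(5, "dave", 63, "erin", 29, "frank");
        int k = 2;
        TreeMap<Integer, String> result =
                merge(List.of(lowestK(split1, k), lowestK(split2, k)), k);
        System.out.println(result); // prints {5=dave, 17=bob}
    }
}
```

Because every local result already contains at most k entries, the merge step only ever sees k records per simulated mapper, which is the point of filtering on the map side first.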
The code for TopKMapper is as follows:
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
...