Key/value pairs lend themselves to solving many problems efficiently in parallel, because records that share a key can be grouped and processed independently of records with other keys. Apache Mahout, a machine-learning library that was initially developed on top of Apache Hadoop, implements many machine-learning algorithms in the areas of classification, clustering, and collaborative filtering by using the MapReduce key/value-pair architecture. In this chapter, you’ll work through recipes that develop skills for solving interesting big data problems from many disciplines.
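To make the key/value idea concrete before starting the recipes, the following is a minimal sketch of the map, group-by-key, and reduce steps in plain Python. It does not use Spark or Mahout; the function names `map_phase`, `shuffle_phase`, and `reduce_phase` and the sample sentences are hypothetical and chosen only for illustration. In a real cluster, each key's values could be reduced on a different machine.

```python
from collections import defaultdict
from functools import reduce


def map_phase(lines):
    """Emit one (word, 1) key/value pair for every word in every line."""
    return [(word.lower(), 1) for line in lines for word in line.split()]


def shuffle_phase(pairs):
    """Group values by key, as a shuffle step would do between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values of each key; each key can be handled independently."""
    return {key: reduce(lambda a, b: a + b, values)
            for key, values in grouped.items()}


# Illustrative input only (not data from this chapter).
lines = ["big data big ideas", "data in pairs"]
word_counts = reduce_phase(shuffle_phase(map_phase(lines)))
print(word_counts)
```

Because the reduction for one key never reads the values of another key, the work in `reduce_phase` can be split across workers by key, which is the property that makes the key/value model a good fit for parallel processing.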
Recipe 5-1. Create a paired RDD
Recipe 5-2. Perform ...