Chapter 9. Predictive analytics with Mahout


This chapter covers
  • Using recommenders to make product suggestions
  • Spam email classification with naïve Bayes
  • Clustering to identify trends or patterns in data


Predictive analytics is the field of deriving information from current and historical data. It’s one of the main tools in the tool belt of a data scientist, whose job is to examine large datasets (often called big data these days) and derive meaningful insights from them, ideally in the form of new products. Predictive analytics can be broken down into three broad categories:

  • Recommender—Recommender systems suggest items based on past behavior or interest. These items can be other users in a social network, or products and services ...
