Classification using the naïve Bayes classifier

A classifier assigns inputs into one of the N classes based on some properties (also known as features) of inputs. Classifiers have widespread applications, such as e-mail spam filtering, finding the most promising products, selecting customers for closer interactions, and taking decisions in machine learning situations. Let's explore how to implement a classifier using a large dataset. For instance, a spam filter will assign each e-mail to one of the two clusters: spam mail or not spam mail.

There are many classification algorithms. One of the simplest, but effective, algorithm is the naïve Bayesian classifier that uses the Bayes theorem involving conditional probability.

In this recipe, we will also ...

Get Hadoop MapReduce v2 Cookbook - Second Edition now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.