September 2016
Beginner to intermediate
344 pages
7h 44m
English
Text classification is about assigning a topic, subject category, genre, or something similar to the text blob. For example, spam filters assign spam or not spam to an email.
Apache Spark supports various classifiers through MLlib and ML packages. The SVM classifier and Naive Bayes classifier are popular classifiers, and the former was already covered in the previous chapter. Let's take a look at the latter now.
The Naive Bayes (NB) classifier is a multiclass probabilistic classifier and is one of the best classification algorithms. It assumes strong independence between every pair of features. It computes the conditional probability distribution of each feature and a given label, and then applies Bayes' ...
Read now
Unlock full access