July 2017
Intermediate to advanced
382 pages
9h 13m
English
However, if we want to classify the full dataset, we need a more sophisticated approach. We turn to scikit-learn's naive Bayes classifier, as it understands how to handle sparse matrices. In fact, if you didn't pay attention and treated X_train like any of the NumPy arrays we used before, you might not even notice that anything is different:
In [17]: from sklearn import naive_bayes
...      model_naive = naive_bayes.MultinomialNB()
...      model_naive.fit(X_train, y_train)
Out[17]: MultinomialNB(alpha=1.0, class_prior=None, fit_prior=True)
Here we used MultinomialNB from the naive_bayes module, the variant of the naive Bayes classifier that is best suited to discrete count data, such as word counts.
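To see the sparse-matrix handling in isolation, here is a minimal sketch on a toy corpus. The four documents, their labels, and the variable names (train_docs, test_docs, predictions) are made up for illustration and are not the dataset used in this chapter; CountVectorizer is assumed here only as a convenient way to produce a sparse matrix of word counts.

```python
from scipy import sparse
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy corpus: 1 = spam, 0 = ham
train_docs = [
    "win money now",
    "free money offer",
    "meeting at noon",
    "lunch at noon tomorrow",
]
y_train = [1, 1, 0, 0]

# CountVectorizer returns a SciPy sparse matrix of word counts
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_docs)
print(type(X_train), sparse.issparse(X_train))

# MultinomialNB accepts the sparse matrix directly; no dense conversion needed
model_naive = MultinomialNB()
model_naive.fit(X_train, y_train)

# New documents must go through the same vocabulary via transform()
test_docs = ["free money", "meeting tomorrow"]
X_test = vectorizer.transform(test_docs)
predictions = model_naive.predict(X_test)
print(predictions)
```

The call to fit looks exactly as it would with a dense NumPy array, which is the point made above: the classifier works on the sparse representation without any extra steps on our side.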
The classifier is trained ...