O'Reilly logo

Building Machine Learning Systems with Python by Willi Richert, Luis Pedro Coelho

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Creating our first classifier and tuning it

The Naive Bayes classifiers reside in the sklearn.naive_bayes package. There are different kinds of Naive Bayes classifiers:

  • GaussianNB: This assumes the features to be normally distributed (Gaussian). One use case for it could be the classification of sex according to the given height and width of a person. In our case, we are given tweet texts from which we extract word counts. These are clearly not Gaussian distributed.
  • MultinomialNB: This assumes the features to be occurrence counts, which is relevant to us since we will be using word counts in the tweets as features. In practice, this classifier also works well with TF-IDF vectors.
  • BernoulliNB: This is similar to MultinomialNB, but more suited when ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required