Comparing results with a dummy classifier

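The scikit-learn DummyClassifier class implements several simple prediction strategies that ignore the input features, such as random guessing. These can serve as a baseline for real classifiers: a model that cannot beat them has learned little from the data. The strategies are as follows: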

stratified: This generates random predictions that follow the training set class distribution
most_frequent: This predicts the most frequent class
prior: This is available since scikit-learn 0.17 and always predicts the class that maximizes the class prior
uniform: This uses a uniform distribution to randomly sample classes
constant: This predicts a user-specified class
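
The strategies listed above can be tried side by side. The following is a minimal sketch, not the recipe's own code: the toy arrays X and y, the helper name fit_baselines, and the choice of class 1 for the constant strategy are all assumptions made for illustration.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Hypothetical toy data: seven samples of class 0 and three of class 1.
# DummyClassifier ignores the feature values, so zeros are enough here.
X = np.zeros((10, 1))
y = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])


def fit_baselines(X, y, seed=0):
    """Fit one DummyClassifier per strategy and return its predictions."""
    # The "constant" strategy needs the class to predict (assumed: class 1).
    strategies = {
        "stratified": {},
        "most_frequent": {},
        "prior": {},
        "uniform": {},
        "constant": {"constant": 1},
    }
    results = {}
    for name, extra in strategies.items():
        clf = DummyClassifier(strategy=name, random_state=seed, **extra)
        clf.fit(X, y)
        results[name] = clf.predict(X)
    return results


baseline_predictions = fit_baselines(X, y)
for name, pred in baseline_predictions.items():
    print(name, pred)
```

With this data, most_frequent and prior predict class 0 for every sample and constant predicts class 1 for every sample, while stratified and uniform depend on the random seed.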

As you can see, some strategies of the DummyClassifier class always predict the same class. This can lead to warnings from some scikit-learn metrics functions. We will perform the same analysis as we did in the Computing precision, ...
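
To illustrate the kind of warning mentioned above, here is a small sketch under assumed toy data (the arrays X and y are hypothetical). A classifier that always predicts class 0 never predicts the positive class, so precision for class 1 is undefined; scikit-learn reports it as 0.0 and issues a warning.

```python
import warnings

import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, precision_score

# Hypothetical toy data: class 0 is the majority class.
X = np.zeros((10, 1))
y = np.array([0] * 7 + [1] * 3)

# most_frequent always predicts class 0 for this data.
clf = DummyClassifier(strategy="most_frequent").fit(X, y)
pred = clf.predict(X)

# Record warnings instead of printing them, so they can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Class 1 (the default positive label) is never predicted.
    precision = precision_score(y, pred)

accuracy = accuracy_score(y, pred)
print("accuracy:", accuracy)
print("precision:", precision)
print("warnings:", [w.category.__name__ for w in caught])
```

The accuracy of 0.7 looks respectable only because of the class imbalance, which is why a dummy baseline is a useful reference point. Recent scikit-learn versions also accept a zero_division argument in precision_score to control the value returned in this case.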

Python Data Analysis Cookbook by Ivan Idris
