Binary classification

In addition to the error measures shown in the preceding section, in problems where you have only two output classes (for instance, if you have to guess the gender of a user or predict whether the user will click/buy/like the item), there are some additional measures. The most used one, since it's very informative, is the area under the receiver operating characteristics curve (ROC) or area under a curve (AUC).

The ROC curve is a graphical way to express how the performances of the classifier change over all the possible classification thresholds (that is, changes in the outcome when its parameters change). Specifically, these performances have a true positive (or hit) rate and a false positive (or miss) rate. The first ...

Get Python Data Science Essentials - Third Edition now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.