Precision-recall curve

When faced with class imbalance, we use precision-recall curves instead of ROC curves. A precision-recall curve plots precision against recall across the probability thresholds that could be used when making predictions. The baseline is a horizontal line at the fraction of the data that belongs to the positive class. We want our curve to sit above this line, with an area under the precision-recall curve (AUPR) greater than that fraction (the higher, the better). The ml_utils.classification module contains the function for drawing precision-recall curves and providing the AUPR:

import matplotlib.pyplot as plt
from sklearn.metrics import (
    auc, average_precision_score, precision_recall_curve
)

def plot_pr_curve(y_test, preds, positive_class=1, ...
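
The listing above is cut off, so the following is only a minimal sketch of the quantities such a function would need, not the body of plot_pr_curve itself. It assumes synthetic labels with roughly 10% positives and made-up scores (the names y_true, scores, and baseline are illustrative), and it uses only the scikit-learn functions already imported above:

```python
import numpy as np
from sklearn.metrics import auc, average_precision_score, precision_recall_curve

# Hypothetical data: about 10% positives, with positives scoring higher on average.
rng = np.random.default_rng(0)
y_true = (rng.random(1000) < 0.1).astype(int)
scores = rng.normal(loc=1.5 * y_true, scale=1.0)

# Precision and recall at each candidate threshold.
precision, recall, thresholds = precision_recall_curve(y_true, scores)

# Area under the precision-recall curve via the trapezoidal rule.
aupr = auc(recall, precision)

# Average precision is a related step-wise summary of the same curve.
ap = average_precision_score(y_true, scores)

# Baseline: the fraction of observations in the positive class.
baseline = y_true.mean()

print(f"AUPR: {aupr:.3f}, AP: {ap:.3f}, baseline: {baseline:.3f}")
```

To draw the curve, one could pass recall and precision to plt.plot() and add the baseline with plt.axhline(baseline). Note that auc() and average_precision_score() will generally differ slightly, since the first interpolates linearly between points and the second does not.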
