When faced with class imbalance, we use precision-recall curves instead of ROC curves. A precision-recall curve plots precision against recall at the various probability thresholds that could be used to turn predicted probabilities into class predictions. The baseline is a horizontal line at the proportion of the data that belongs to the positive class. We want our curve to sit above this line, with an area under the precision-recall curve (AUPR) greater than that proportion (the higher, the better). The ml_utils.classification module contains a function for drawing precision-recall curves and reporting the AUPR:
import matplotlib.pyplot as plt
from sklearn.metrics import (
    auc, average_precision_score, precision_recall_curve
)

def plot_pr_curve(y_test, preds, positive_class=1, ...
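To make the idea of "precision versus recall at various thresholds" concrete, here is a minimal pure-Python sketch. It is not the body of plot_pr_curve() and does not reproduce scikit-learn's implementation. The helper names precision_recall_at() and pr_baseline(), the toy labels, and the toy scores are all hypothetical and chosen only for illustration.

```python
def precision_recall_at(y_true, scores, threshold, positive_class=1):
    """Return (precision, recall) when scores >= threshold are predicted positive."""
    tp = fp = fn = 0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        actual_positive = label == positive_class
        if predicted_positive and actual_positive:
            tp += 1  # true positive
        elif predicted_positive:
            fp += 1  # false positive
        elif actual_positive:
            fn += 1  # false negative
    # Assumption: treat precision as 1.0 when nothing is predicted positive
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def pr_baseline(y_true, positive_class=1):
    """Height of the baseline: the proportion of positive-class observations."""
    return sum(label == positive_class for label in y_true) / len(y_true)


# Hypothetical imbalanced data: 2 positives out of 10 observations
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.45, 0.60, 0.70, 0.90]

print(f'baseline: {pr_baseline(y_true):.2f}')
for threshold in (0.3, 0.5, 0.8):
    precision, recall = precision_recall_at(y_true, scores, threshold)
    print(f'threshold={threshold}: precision={precision:.2f}, recall={recall:.2f}')
```

Raising the threshold in this toy example trades recall for precision: at 0.5 both positives are caught along with one false positive, while at 0.8 the single prediction is correct but one positive is missed. Sweeping every threshold and plotting the resulting pairs gives the curve; the precision_recall_curve() function imported in the listing above performs that sweep.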