In addition to using metrics to evaluate classification problems, we can turn to visualizations. By plotting the true positive rate (sensitivity) versus the false positive rate (1 - specificity), we get the Receiver Operating Characteristic (ROC) curve. This curve allows us to visualize the trade-off between the true positive rate and the false positive rate. We can identify a false positive rate that we are willing to accept and use it to find the probability threshold to use as a cutoff when predicting the class from the probabilities returned by the predict_proba() method in scikit-learn. Say that we find the threshold to be 60%: we would then require predict_proba() to return a value greater than or equal to 0.6 to predict the positive class (predict() ...
ROC curve
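To make the threshold selection described above concrete, here is a minimal pure-Python sketch. It is not the book's code: the helper names `roc_points` and `pick_threshold`, the labels, the scores, and the 20% acceptable false positive rate are all made up for illustration. In practice the scores would be the positive-class probabilities from predict_proba(), and scikit-learn's `sklearn.metrics.roc_curve` computes the same kind of (false positive rate, true positive rate, threshold) triples.

```python
def roc_points(y_true, scores):
    """Return (threshold, fpr, tpr) tuples, one per distinct score, highest threshold first."""
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        # Predict the positive class when the score is >= the threshold.
        predicted = [s >= threshold for s in scores]
        tp = sum(1 for p, y in zip(predicted, y_true) if p and y == 1)
        fp = sum(1 for p, y in zip(predicted, y_true) if p and y == 0)
        points.append((threshold, fp / negatives, tp / positives))
    return points


def pick_threshold(points, max_fpr):
    """Return the lowest threshold whose FPR stays within max_fpr.

    Lowering the threshold never decreases the TPR, so the lowest
    acceptable threshold also gives the highest TPR within the FPR budget.
    """
    acceptable = [t for t, fpr, _ in points if fpr <= max_fpr]
    return min(acceptable) if acceptable else None


# Hypothetical data: true labels and positive-class probabilities, such as
# the second column of predict_proba() output might contain.
y_true = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]
scores = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.65, 0.70, 0.80, 0.90]

points = roc_points(y_true, scores)

# Assumption for illustration: we accept a false positive rate of at most 20%.
threshold = pick_threshold(points, max_fpr=0.2)

# Apply the cutoff: probabilities >= threshold become the positive class.
predictions = [int(s >= threshold) for s in scores]

print(threshold)
print(predictions)
```

With this toy data the chosen cutoff happens to be 0.6, matching the 60% example in the text. Plotting the `fpr` values against the `tpr` values from `points` would trace out the ROC curve itself.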