Model selection
What are we to make of all this? We have the confusion matrices from our models to guide us, but we can get a little more sophisticated when it comes to selecting the classification models. An effective tool for a classification model comparison is the Receiver Operating Characteristic (ROC) chart. Very simply, ROC is a technique for visualizing, organizing, and selecting the classifiers based on their performance (Fawcett, 2006). On the ROC chart, the y-axis is the True Positive Rate (TPR) and the x-axis is the False Positive Rate (FPR). The following are the calculations, which are quite simple:
- TPR = Positives correctly classified / total positives
- FPR = Negatives incorrectly classified / total negatives
Plotting the ROC results ...
Get R: Unleash Machine Learning Techniques now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.