Classifier performance evaluation

So far, we have covered our first machine learning classifier and evaluated its performance in depth by prediction accuracy. Beyond accuracy, several other measurements give more insight and guard against the misleading effects of class imbalance.
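To see why accuracy alone can mislead under class imbalance, consider the following minimal sketch. The labels are made up for illustration (they are not the dataset used in this chapter), and the "classifier" is a hypothetical one that always predicts the majority class:

```python
# Hypothetical toy labels: 95 negatives (0) and 5 positives (1).
y_true = [0] * 95 + [1] * 5
# A degenerate "classifier" that always predicts the majority class.
y_pred = [0] * 100


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


acc = accuracy(y_true, y_pred)
# Count how many of the true positives were actually detected.
positives_found = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))

print(acc)              # 0.95
print(positives_found)  # 0
```

The accuracy looks high even though not a single positive instance is detected, which is the kind of effect the additional measurements are meant to expose.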

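A confusion matrix summarizes the testing instances by their predicted values and true values, presented as a contingency table. For a binary problem, the table has four cells: true negatives, false positives, false negatives, and true positives.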

To illustrate, we compute the confusion matrix of our naive Bayes classifier. Here we use the confusion_matrix function from scikit-learn, although the matrix is straightforward to code ourselves:

>>> from sklearn.metrics import confusion_matrix
>>> confusion_matrix(Y_test, ...
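As a rough idea of what coding it ourselves might look like, here is a minimal pure-Python sketch. The function name confusion_matrix_manual and the toy labels are hypothetical, introduced only for illustration. The layout follows the scikit-learn convention, with rows for true classes and columns for predicted classes:

```python
def confusion_matrix_manual(y_true, y_pred, labels=(0, 1)):
    """Count instances by (true label, predicted label).

    Rows correspond to true labels and columns to predicted labels,
    both in the order given by `labels`.
    """
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


# Hypothetical toy labels, not the chapter's test set.
y_true_toy = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred_toy = [0, 1, 0, 1, 0, 1, 1, 0]

cm = confusion_matrix_manual(y_true_toy, y_pred_toy)
# For binary labels (0, 1) the cells are [[TN, FP], [FN, TP]].
print(cm)  # [[3, 1], [1, 3]]
```

Passing the label order explicitly keeps the row and column positions fixed even when a class happens to be absent from the predictions.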
