So far, we have covered our first machine learning classifier and evaluated its performance in depth using prediction accuracy. Beyond accuracy, there are several measurements that give us more insight and help avoid the effects of class imbalance.
A confusion matrix summarizes testing instances by their predicted values and true values, presented as a contingency table:
To illustrate, we compute the confusion matrix of our naive Bayes classifier. Here, the scikit-learn confusion_matrix function is used, but it is very easy to code it ourselves:
>>> from sklearn.metrics import confusion_matrix
>>> confusion_matrix(Y_test, ...
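As a sketch of how we could code the confusion matrix ourselves, the following counts testing instances by their (true, predicted) label pairs. The label set, the `y_true`/`y_pred` sample arrays, and the function name are illustrative assumptions, not part of the original example:

```python
import numpy as np

def confusion_matrix_manual(y_true, y_pred, labels=(0, 1)):
    """Count instances by (true label, predicted label).

    Rows correspond to true labels, columns to predicted labels,
    matching the scikit-learn convention. The `labels` tuple is an
    assumption here (binary 0/1 classification).
    """
    index = {label: i for i, label in enumerate(labels)}
    matrix = np.zeros((len(labels), len(labels)), dtype=int)
    for t, p in zip(y_true, y_pred):
        matrix[index[t], index[p]] += 1
    return matrix

# Hypothetical sample labels for illustration:
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]
print(confusion_matrix_manual(y_true, y_pred))
```

Each cell `[i, j]` holds the number of instances whose true label is `labels[i]` and predicted label is `labels[j]`, so the diagonal counts correct predictions.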