We want to be a little more formal when we talk about a good classifier. What does that mean? The performance of a classifier is a measure of its effectiveness. The simplest performance measure is accuracy: given a classifier and an evaluation dataset, it measures the proportion of instances correctly classified by the classifier. First, let's test the accuracy on the training set:
>>> from sklearn import metrics >>> y_train_pred = clf.predict(X_train) >>> print metrics.accuracy_score(y_train, y_train_pred) 0.821428571429
This figure tells us that 82 percent of the training set instances are correctly classified by our classifier.
Probably, the most important thing you should learn from this chapter is that measuring accuracy ...