July 2018
Beginner to intermediate
406 pages
9h 55m
English
We have to be clear about what we want to measure. The simplest, if naïve, approach is to calculate the average prediction quality over the test set. This yields a value between 0, when every prediction is wrong, and 1, when every prediction is correct.
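To make that 0-to-1 range concrete, here is a minimal sketch of accuracy computed by hand as the fraction of matching labels. The helper name `accuracy` and the toy label lists are assumptions made for illustration; they do not come from the chapter.

```python
import numpy as np


def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    # Element-wise comparison gives booleans; their mean is the hit rate.
    return float(np.mean(y_true == y_pred))


# Hypothetical toy labels, used only to show the two extremes and a middle case.
labels = [0, 1, 1, 0]
all_wrong = accuracy(labels, [1, 0, 0, 1])
all_right = accuracy(labels, labels)
one_miss = accuracy(labels, [0, 1, 0, 0])
print(all_wrong, all_right, one_miss)
```

Three of the four predictions in the last case are correct, so its accuracy is 0.75.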
For now, let's use accuracy as the measure of prediction quality, which scikit-learn conveniently calculates for us with knn.score(). But as we learned in Chapter 2, Classifying with Real-world Examples, we will not do this just once; instead, we apply cross-validation using the ready-made KFold class from sklearn.model_selection. Finally, we average the scores on the test set of each fold and use the standard deviation to see how much they vary:
from sklearn.neighbors ...
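The procedure described above can be sketched roughly as follows. This is not the book's listing: the synthetic data from `make_classification`, the helper name `cross_validate_knn`, and the choices of 5 neighbors, 10 folds, and a fixed random seed are all assumptions made so that the example runs on its own.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold
from sklearn.neighbors import KNeighborsClassifier


def cross_validate_knn(X, y, n_neighbors=5, n_splits=10, seed=0):
    """Return the test accuracy of a k-NN classifier on each fold."""
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_idx, test_idx in kf.split(X):
        # Train a fresh classifier on this fold's training part only.
        knn = KNeighborsClassifier(n_neighbors=n_neighbors)
        knn.fit(X[train_idx], y[train_idx])
        # knn.score() returns the mean accuracy on the held-out part.
        scores.append(knn.score(X[test_idx], y[test_idx]))
    return np.array(scores)


# Synthetic stand-in data (assumption); substitute the real feature matrix and labels.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
scores = cross_validate_knn(X, y)
print("Mean(scores)=%.5f\tStddev(scores)=%.5f" % (scores.mean(), scores.std()))
```

A small standard deviation across folds suggests the mean accuracy is a stable estimate, whereas a large one means the result depends heavily on which samples land in the test fold.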