How it works...

This is an example of 4-fold cross-validation because cv = 4 is passed to the cross_val_score function. We split the training data, or CV set (X_train), into four parts, or folds, and iterate by rotating each fold into the role of testing set. At first, fold 1 is the testing set while folds 2, 3, and 4 together form the training set. Then fold 2 is the testing set while folds 1, 3, and 4 form the training set. We repeat this procedure with folds 3 and 4 as the testing set.
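The rotation described above can be sketched with scikit-learn's KFold splitter. This is a minimal illustration rather than the recipe's own code: X_toy is a hypothetical eight-sample stand-in for X_train, and plain unshuffled KFold is assumed so that each fold is a contiguous block of indices.

```python
import numpy as np
from sklearn.model_selection import KFold

# Eight placeholder samples stand in for X_train (hypothetical toy data).
X_toy = np.arange(8).reshape(-1, 1)

# Four folds, no shuffling, so each fold is a contiguous block of two samples.
kf = KFold(n_splits=4)

test_folds = []
for i, (train_idx, test_idx) in enumerate(kf.split(X_toy), start=1):
    # On each iteration one fold is held out for testing and the
    # remaining three folds together make up the training set.
    test_folds.append(test_idx.tolist())
    print(f"iteration {i}: test fold = {test_idx.tolist()}, "
          f"training folds = {train_idx.tolist()}")
```

Each sample lands in the testing fold exactly once, so every row of the CV set contributes to one of the four scores.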

Once we split the dataset into folds, we score the algorithm four times, once per testing fold. For the first round:

  1. We train one of the nearest neighbors algorithms on folds 2, 3, and 4.
  2. Then we predict on fold 1, the test fold.
  3. We measure ...
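The train, predict, and measure loop can be written out by hand and compared with cross_val_score. The sketch below rests on several assumptions that are not taken from the text: the iris dataset, a KNeighborsClassifier with n_neighbors=3, and accuracy as the measurement (accuracy is the default score for classifiers in cross_val_score). StratifiedKFold is used for the manual loop because that is what cross_val_score applies when cv is an integer and the estimator is a classifier.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumed data: iris, with a held-out test set kept apart from the CV set.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Manual 4-fold loop over the CV set (X_train, y_train).
skf = StratifiedKFold(n_splits=4)
manual_scores = []
for train_idx, test_idx in skf.split(X_train, y_train):
    # 1. Train a nearest neighbors model on the three training folds.
    model = KNeighborsClassifier(n_neighbors=3)  # n_neighbors is an assumption
    model.fit(X_train[train_idx], y_train[train_idx])
    # 2. Predict on the held-out test fold.
    predictions = model.predict(X_train[test_idx])
    # 3. Measure accuracy on that fold (assumed metric).
    manual_scores.append(np.mean(predictions == y_train[test_idx]))

# The same procedure in one call.
library_scores = cross_val_score(
    KNeighborsClassifier(n_neighbors=3), X_train, y_train, cv=4
)

print("manual: ", np.round(manual_scores, 3))
print("library:", np.round(library_scores, 3))
```

The two arrays should agree fold by fold, which shows that cross_val_score is a convenience wrapper around this loop rather than a different procedure.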
