How it works...

This is an example of 4-fold cross-validation because cv = 4 in the call to the cross_val_score function. We split the training data, or CV set (X_train), into four parts, or folds, and rotate through them so that each fold serves as the testing set exactly once. First, fold 1 is the testing set while folds 2, 3, and 4 together form the training set. Then fold 2 is the testing set while folds 1, 3, and 4 form the training set. We repeat the procedure with folds 3 and 4 as the testing set.
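The rotation of folds can be sketched with scikit-learn's KFold splitter. This is only an illustration: the 8-sample array below is a hypothetical stand-in for X_train, and the splitter is left unshuffled so the folds are easy to read. Note that when cross_val_score receives an integer cv together with a classifier, it uses stratified folds, so the indices in a real run may differ from this plain split.

```python
import numpy as np
from sklearn.model_selection import KFold

# Hypothetical stand-in for X_train: 8 samples with 2 features each.
X_train = np.arange(16).reshape(8, 2)

# Four folds, no shuffling, so each fold is a contiguous block of rows.
kf = KFold(n_splits=4)

# Collect (training indices, testing indices) for each of the four rotations.
splits = [
    (train_idx.tolist(), test_idx.tolist())
    for train_idx, test_idx in kf.split(X_train)
]

for fold_number, (train_idx, test_idx) in enumerate(splits, start=1):
    print(f"Fold {fold_number} is the testing set: rows {test_idx}; training rows {train_idx}")
```

Each row index appears in exactly one testing fold and in the training set of the other three rotations.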

Once we split the dataset into folds, we score the algorithm four times:

  1. We train one of the nearest neighbors algorithms on folds 2, 3, and 4.
  2. Then we predict on fold 1, the test fold.
  3. We measure ...
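The train, predict, and measure steps above can be written out as an explicit loop and compared with cross_val_score. This is a minimal sketch under several assumptions that are not taken from the recipe: the iris dataset stands in for the data, the model is a KNeighborsClassifier with 5 neighbors, and the measurement is plain accuracy (the default score for scikit-learn classifiers).

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: iris and a 5-neighbor classifier stand in for the recipe's data and model.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5)

# For a classifier, cv=4 in cross_val_score corresponds to unshuffled stratified folds.
folds = StratifiedKFold(n_splits=4)

manual_scores = []
for train_idx, test_idx in folds.split(X_train, y_train):
    model = clone(knn)                                  # fresh, unfitted copy per fold
    model.fit(X_train[train_idx], y_train[train_idx])   # train on three folds
    predictions = model.predict(X_train[test_idx])      # predict on the test fold
    # Assumed metric: accuracy, the fraction of correct predictions on the test fold.
    manual_scores.append(np.mean(predictions == y_train[test_idx]))

# The same four scores, computed by scikit-learn in one call.
cv_scores = cross_val_score(knn, X_train, y_train, cv=4)

print(manual_scores)
print(cv_scores)
```

The held-out X_test and y_test are not touched during cross-validation; only the CV set is split into folds.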
