In scikit-learn, cross-validation can be performed in three steps:
- Load the dataset. Since we already did this earlier, we don't have to do it again.
- Instantiate the classifier:
In [8]: from sklearn.neighbors import KNeighborsClassifier
...     model = KNeighborsClassifier(n_neighbors=1)
- Perform cross-validation with the cross_val_score function. This function takes as input a model, the full dataset (X), the target labels (y), and an integer value for the number of folds (cv). There is no need to split the data by hand; the function does that automatically according to the number of folds. Once cross-validation is complete, the function returns one test score per fold:
In [9]: from sklearn.model_selection ...
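The three steps above can be sketched end to end as a standalone script. This is only an illustration under stated assumptions: the Iris dataset (loaded with `load_iris`) stands in for whatever dataset was loaded earlier, and `cv=5` is an arbitrary choice of fold count, not a value taken from the text.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Step 1: load a dataset (assumption: Iris is used here as a stand-in)
X, y = load_iris(return_X_y=True)

# Step 2: instantiate the classifier, as in the text
model = KNeighborsClassifier(n_neighbors=1)

# Step 3: run cross-validation; cv=5 is an assumed number of folds.
# The function splits X and y into folds internally.
scores = cross_val_score(model, X, y, cv=5)

print(scores)                        # one test score per fold
print(scores.mean(), scores.std())   # summary across folds
```

Reporting the mean together with the standard deviation gives a rough sense of how much the score varies from fold to fold.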