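To test the KNN algorithm, we are going to use the handwritten digit dataset provided directly by Scikit-Learn (a small, MNIST-like dataset derived from the UCI optical recognition of handwritten digits collection, not the original 28 × 28 MNIST). It is made up of 1,797 8 × 8 grayscale images representing the digits from 0 to 9, with raw pixel intensities ranging from 0 to 16. The first step is loading it and normalizing all the values so that they lie between 0 and 1: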
import numpy as np

from sklearn.datasets import load_digits

digits = load_digits()
X_train = digits['data'] / np.max(digits['data'])
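As a quick sanity check of the loading step, the following minimal sketch repeats the normalization and inspects the result. The intermediate name raw is introduced here only for illustration; everything else uses the same calls as the snippet above.

```python
import numpy as np

from sklearn.datasets import load_digits

digits = load_digits()

# 'raw' is an illustrative name for the unnormalized, flattened images
raw = digits['data']
X_train = raw / np.max(raw)

# One row per image, one column per pixel of the flattened 8 x 8 grid
print(X_train.shape)

# Value range before and after dividing by the global maximum
print(raw.min(), raw.max())
print(X_train.min(), X_train.max())

# The labels cover the ten digit classes
print(np.unique(digits['target']))
```

Dividing by the global maximum (rather than scaling each pixel column separately) keeps the relative intensities between pixels unchanged, which matters for a distance-based method such as KNN.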
The dictionary digits contains both the images, digits['images'], and the flattened 64-dimensional arrays, digits['data']. Scikit-Learn implements different classes (for example, it's possible to work directly with KD Trees and Ball Trees using the KDTree and BallTree classes) that can ...