Classifying handwritten digits

The Modified National Institute of Standards and Technology (MNIST) database is a collection of 70,000 images of handwritten digits. The digits were sampled from documents written by employees of the US Census Bureau and by American high school students. Each image is grayscale and measures 28 x 28 pixels. Let's inspect some of the images using the following script:

# In[1]:
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_mldata
import matplotlib.cm as cm

mnist = fetch_mldata('MNIST original', data_home='data/mnist')

counter = 1
for i in range(1, 4):
    for j in range(1, 6):
        plt.subplot(3, 5, counter)
        plt.imshow(mnist.data[(i - 1) * 8000 + j].reshape((28, 28)), cmap=cm.Greys_r)
        plt.axis('off')
        ...
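Recent scikit-learn releases no longer include fetch_mldata, so the script above may fail at the import. The following is a minimal sketch of the same idea, not the book's code. It assumes the OpenML copy of the dataset, loaded with fetch_openml('mnist_784'), is an acceptable substitute. The helper names sample_indices, plot_samples, and load_mnist are hypothetical and introduced only for illustration. The row order in the OpenML copy may differ from the one the book used, so the same indices may show different digits.

```python
def sample_indices(rows=3, cols=5, stride=8000):
    """Return the dataset indices the plotting grid visits, row by row.

    Mirrors the expression (i - 1) * 8000 + j from the script above.
    """
    return [(i - 1) * stride + j
            for i in range(1, rows + 1)
            for j in range(1, cols + 1)]


def plot_samples(data, rows=3, cols=5, stride=8000):
    """Draw the selected 784-value rows of `data` as 28 x 28 grayscale images."""
    # Imported here so sample_indices can be tried without matplotlib.
    import matplotlib.pyplot as plt

    indices = sample_indices(rows, cols, stride)
    for counter, index in enumerate(indices, start=1):
        plt.subplot(rows, cols, counter)
        # Each row is a flat vector of 784 pixel values; reshape it to 28 x 28.
        plt.imshow(data[index].reshape((28, 28)), cmap="gray")
        plt.axis("off")
    plt.show()


def load_mnist(data_home="data/mnist"):
    """Assumption: use the OpenML copy of MNIST (downloaded on first call)."""
    from sklearn.datasets import fetch_openml

    # as_frame=False returns NumPy arrays, so rows can be indexed and reshaped.
    return fetch_openml("mnist_784", version=1, as_frame=False,
                        data_home=data_home)
```

Calling plot_samples(load_mnist().data) should show a 3 x 5 grid of digits. Calling sample_indices() on its own lists which rows are picked, which is a quick way to check the grid before downloading anything.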
