How to do it...

  1. A recurring theme in this book, inherited from scikit-learn itself, is the reusable class that is fit to one dataset and can subsequently be used to transform unseen datasets. This is illustrated as follows:
from sklearn import preprocessing
impute = preprocessing.Imputer()
iris_X_prime = impute.fit_transform(iris_X)
iris_X_prime[:5]
array([[ 5.82616822,  3.5       ,  1.4       ,  1.22589286],
       [ 4.9       ,  3.        ,  1.4       ,  0.2       ],
       [ 4.7       ,  3.2       ,  1.3       ,  0.2       ],
       [ 5.82616822,  3.1       ,  1.5       ,  0.2       ],
       [ 5.        ,  3.6       ,  1.4       ,  1.22589286]])
  2. Notice the difference at position [0, 0]:
iris_X_prime[0, 0]
5.8261682242990664
iris_X[0, 0]
nan
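
The steps above call fit_transform on a single dataset, so the "transform unseen datasets" half of the pattern is not shown. The following is a minimal sketch of that half, not the book's own code. It assumes a newer scikit-learn release, where preprocessing.Imputer has been replaced by SimpleImputer in sklearn.impute, and it uses two small hypothetical arrays, X_train and X_new, in place of iris_X.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical training data with missing entries (not the book's iris_X).
X_train = np.array([[1.0, 10.0],
                    [np.nan, 20.0],
                    [3.0, np.nan]])

# Hypothetical unseen data, also containing missing entries.
X_new = np.array([[np.nan, np.nan],
                  [5.0, 40.0]])

# Mean imputation; fit() computes and stores one mean per column.
impute = SimpleImputer(strategy="mean")
impute.fit(X_train)

# Column means learned from X_train, with NaNs ignored.
train_means = impute.statistics_

# transform() fills the NaNs in X_new with the stored training means,
# not with means recomputed from X_new.
X_new_prime = impute.transform(X_new)
```

Because the column means are stored on the fitted object, the same impute instance can be applied to any later dataset with the same number of columns, and the filled-in values stay consistent with the training data.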
