Chapter 6. Algorithm Chains and Pipelines
For many machine learning algorithms, the particular representation of
the data that you provide is very important, as we discussed in Chapter 4. This starts with scaling the data and combining features by hand and
goes all the way to learning features using unsupervised machine
learning, as we saw in Chapter 3. Consequently, most machine learning
applications require not only the application of a single algorithm, but
the chaining together of many different processing steps and machine
learning models. In this chapter, we will cover how to use the Pipeline
class to simplify the process of building chains of transformations and
models. In particular, we will see how we can combine Pipeline and
GridSearchCV to search over parameters for all processing steps at once.

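To make the idea of searching over parameters for a whole chain concrete, here is a minimal sketch using scikit-learn. The breast cancer dataset, the step names "scaler" and "svm", and the grid values are assumptions chosen for illustration, not settings taken from this chapter.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Assumption: the breast cancer dataset is used purely as an example.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Chain a preprocessing step and a model; each step is given a name.
# The names "scaler" and "svm" are arbitrary labels chosen here.
pipe = Pipeline([("scaler", MinMaxScaler()), ("svm", SVC())])

# Parameters of a step are addressed as <step name>__<parameter name>.
# The grid values below are illustrative, not tuned recommendations.
param_grid = {
    "svm__C": [0.1, 1, 10, 100],
    "svm__gamma": [0.01, 0.1, 1],
}

# For every cross-validation split, the whole pipeline (scaler included)
# is refit on the training part of that split only.
grid = GridSearchCV(pipe, param_grid=param_grid, cv=5)
grid.fit(X_train, y_train)

test_score = grid.score(X_test, y_test)
print("Best parameters:", grid.best_params_)
print("Test set accuracy: {:.2f}".format(test_score))
```

Because the scaler sits inside the pipeline, it never sees the held-out part of a cross-validation split, which is the main practical reason to search over the chain as a single estimator.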
As an example of the importance of chaining models, we noticed that we
can greatly improve the performance of a kernel SVM on the
dataset by using the
MinMaxScaler for preprocessing. Here’s code for
splitting the data, computing the minimum and maximum, scaling the data, and
training the SVM:
# load and split the data
# compute minimum and maximum on the training ...
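The steps listed above (splitting, computing the minimum and maximum, scaling, and training) might look roughly like the following sketch. The original listing is cut off here, so this is a reconstruction under assumptions rather than the book's code: the breast cancer dataset, the default SVC settings, and the variable names are all chosen for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# load and split the data (assumption: breast cancer dataset)
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# compute minimum and maximum on the training data only
scaler = MinMaxScaler().fit(X_train)

# rescale the training data to the [0, 1] range per feature
X_train_scaled = scaler.transform(X_train)

# train a kernel SVM (default RBF kernel) on the scaled training data
svm = SVC()
svm.fit(X_train_scaled, y_train)

# scale the test data with the training-set minimum and maximum, then score
X_test_scaled = scaler.transform(X_test)
test_score = svm.score(X_test_scaled, y_test)
print("Test score: {:.2f}".format(test_score))
```

Note that the scaler is fit on the training data alone and then reused for the test data, so no information from the test set leaks into preprocessing.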