Stacking for regression

Here, we build a stacking ensemble for the diabetes regression dataset. The ensemble consists of three base learners: a 5-neighbor k-Nearest Neighbors (k-NN) regressor, a decision tree limited to a maximum depth of four, and a ridge regression (a regularized form of least squares regression). The meta-learner is a simple Ordinary Least Squares (OLS) linear regression.
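As a preview of this configuration, the learners might be instantiated as in the following minimal sketch. The variable names base_learners and meta_learner are illustrative, and the ridge regression is left at scikit-learn's default regularization strength, which is an assumption because the text does not specify one.

```python
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import LinearRegression, Ridge

# Base learners, configured as described in the text
base_learners = [
    KNeighborsRegressor(n_neighbors=5),   # 5-neighbor k-NN
    DecisionTreeRegressor(max_depth=4),   # tree limited to depth four
    Ridge(),                              # default alpha is an assumption
]

# Meta-learner: ordinary least squares linear regression
meta_learner = LinearRegression()
```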

First, we have to import the required libraries and data. Scikit-learn provides a convenient way to split data into K folds: the KFold class in the sklearn.model_selection module. As in previous chapters, we use the first 400 instances for training and the remaining instances for testing:

# --- SECTION 1 ---
# Libraries and data loading
from sklearn.datasets ...
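The setup described above can be sketched as follows. This is a minimal illustration, not the book's listing: the variable names and the choice of five folds are assumptions, since the text specifies only the 400-instance training split and the use of KFold.

```python
from sklearn.datasets import load_diabetes
from sklearn.model_selection import KFold

# Load the diabetes regression dataset bundled with scikit-learn
diabetes = load_diabetes()

# First 400 instances for training, the remaining ones for testing
train_x, train_y = diabetes.data[:400], diabetes.target[:400]
test_x, test_y = diabetes.data[400:], diabetes.target[400:]

# Split the training set into K folds (n_splits=5 is an assumption)
kf = KFold(n_splits=5)

# Each iteration yields index arrays for the fold's training and
# validation portions; here we only record the validation fold sizes
fold_sizes = []
for train_idx, val_idx in kf.split(train_x):
    fold_sizes.append(len(val_idx))

print(train_x.shape, test_x.shape, fold_sizes)
```

In a stacking setup, the validation indices of each fold would typically be used to collect base-learner predictions on data those learners were not trained on, and these predictions then serve as training inputs for the meta-learner.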
