Ensembling techniques

Ensemble learning, or ensembling, is the process of combining multiple predictive models to produce a supermodel that is more accurate than any individual model on its own.

  • Regression: We will take the average of the predictions for each model
  • Classification: Take a vote and use the most common prediction, or take the average of the predicted probabilities

Imagine that we are working on a binary classification problem (predicting either 0 or 1).

# ENSEMBLING import numpy as np # set a seed for reproducibility np.random.seed(12345) # generate 1000 random numbers (between 0 and 1) for each model, representing 1000 observations mod1 = np.random.rand(1000) mod2 = np.random.rand(1000) mod3 = np.random.rand(1000) mod4 = np.random.rand(1000) ...

Get Principles of Data Science now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.