Chapter 12

Ensemble learning

Abstract

A powerful way to improve performance in machine learning is to combine the predictions of multiple models. This involves constructing an ensemble of classifiers—e.g., a set of decision trees rather than a single tree. We begin by describing bagging and randomization, which both use a single learning algorithm to generate an ensemble predictor. Bagging perturbs the input data using random resampling; randomization introduces a random component into the learning algorithm. The two can be combined, yielding the so-called “random forest” predictor when applied to decision tree learning, along with a variant called “rotation forests.” Decision tree learners are also commonly used when building an ensemble using ...

Get Data Mining, 4th Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.