Python Data Analysis Cookbook

by Ivan Idris
July 2016
Beginner to intermediate
462 pages
9h 14m
English
Packt Publishing

Bagging to improve results

Bootstrap aggregating, or bagging, is an algorithm introduced by Leo Breiman in 1994 that applies bootstrapping to machine learning problems. Bagging was also mentioned in the Learning with random forests recipe.

The algorithm aims to reduce the chance of overfitting with the following steps:

  1. Generate new training sets from the input training data by sampling with replacement.
  2. Fit a model to each generated training set.
  3. Combine the results of the models by averaging (for regression) or majority voting (for classification).
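
The three steps above can be sketched with the standard library alone. This is a minimal illustration, not the book's code: the toy data set, the one-feature threshold "stump" used as the base model, and the helper names (`bootstrap_sample`, `fit_stump`, `bagging_predict`) are all assumptions made for the example.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Step 1: draw len(data) items from data, sampling with replacement."""
    return [rng.choice(data) for _ in data]


def fit_stump(sample):
    """Step 2: fit a toy base model (hypothetical one-feature threshold rule).

    Each observed x is tried as a threshold, in both orientations, and the
    rule with the fewest training errors is kept.
    """
    best = None
    for threshold, _ in sample:
        for above in (0, 1):
            errors = sum(
                (above if x > threshold else 1 - above) != y for x, y in sample
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, above)
    _, threshold, above = best
    return lambda x: above if x > threshold else 1 - above


def bagging_predict(models, x):
    """Step 3: combine the individual predictions by majority voting."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


rng = random.Random(0)

# Made-up (feature, label) pairs: small x values are class 0, large ones class 1.
data = [(0.1, 0), (0.4, 0), (0.9, 0), (1.3, 0),
        (1.6, 1), (2.2, 1), (2.8, 1), (3.1, 1)]

# An odd number of models avoids tied votes in a two-class problem.
models = [fit_stump(bootstrap_sample(data, rng)) for _ in range(25)]

print(bagging_predict(models, 0.0), bagging_predict(models, 3.5))
```

Each stump sees a slightly different resample of the data, so individual rules differ; the vote smooths over those differences, which is the mechanism bagging relies on to reduce overfitting.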

The scikit-learn BaggingClassifier class allows us to bootstrap training examples, and we can also bootstrap features as in the random forests algorithm. When we perform a grid search, we refer to hyperparameters of the base estimator with the ...
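
As a hedged sketch of the grid search described above (not the recipe's actual code), the following wraps a decision tree in `BaggingClassifier` and tunes both the ensemble and the base estimator. It assumes scikit-learn's general nested-parameter convention of joining a parameter name and a sub-parameter with a double underscore. The name of the base-estimator parameter differs between scikit-learn versions, so the example looks it up rather than hard-coding it. The iris data set, the decision tree, and the grid values are illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# The base estimator is passed positionally so the call works regardless of
# what the installed scikit-learn version names that parameter.
bagging = BaggingClassifier(
    DecisionTreeClassifier(random_state=0),
    n_estimators=10,
    bootstrap=True,           # bootstrap the training examples
    bootstrap_features=True,  # also draw features with replacement
    random_state=0,
)

# Assumption: the parameter is called "estimator" in newer releases and
# "base_estimator" in older ones; pick whichever this installation exposes.
base_name = "estimator" if "estimator" in bagging.get_params() else "base_estimator"

param_grid = {
    "max_samples": [0.5, 1.0],           # fraction of examples per model
    "max_features": [0.5, 1.0],          # fraction of features per model
    base_name + "__max_depth": [1, 3],   # hyperparameter of the base tree
}

search = GridSearchCV(bagging, param_grid, cv=3)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

The grid mixes ensemble-level settings (`max_samples`, `max_features`) with a setting of the wrapped tree (`max_depth`), so one search covers both layers.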



ISBN: 9781785282287