Python Data Analysis Cookbook

by Ivan Idris

July 2016

Beginner to intermediate

462 pages

9h 14m

English

Packt Publishing

Content preview from Python Data Analysis Cookbook

Bagging to improve results

Bootstrap aggregating, or bagging, is an algorithm introduced by Leo Breiman in 1994 that applies bootstrapping to machine learning problems. Bagging was also mentioned in the Learning with random forests recipe.

The algorithm aims to reduce the chance of overfitting with the following steps:

1. Generate new training sets from the input training data by sampling with replacement.
2. Fit a model to each generated training set.
3. Combine the results of the models by averaging (regression) or majority voting (classification).
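
The three steps can be sketched by hand with only the standard library. This is a minimal illustration and not the book's code: the OneNearestNeighbour base model, the helper names bagging_fit and bagging_predict, and the toy data are all hypothetical stand-ins chosen to keep the example short.

```python
import random
from collections import Counter


class OneNearestNeighbour:
    """Hypothetical base model: predicts the label of the closest training point."""

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict_one(self, x):
        # Squared Euclidean distance from x to every stored training point.
        dists = [sum((a - b) ** 2 for a, b in zip(x, row)) for row in self.X]
        return self.y[dists.index(min(dists))]


def bagging_fit(X, y, n_models=25, seed=0):
    """Fit n_models base models, each on its own bootstrap sample."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        # Step 1: draw n indices with replacement (a bootstrap sample).
        idx = [rng.randrange(n) for _ in range(n)]
        # Step 2: fit one model to the resampled training set.
        model = OneNearestNeighbour().fit([X[i] for i in idx], [y[i] for i in idx])
        models.append(model)
    return models


def bagging_predict(models, x):
    """Step 3: combine the individual predictions by majority vote."""
    votes = Counter(model.predict_one(x) for model in models)
    return votes.most_common(1)[0][0]


# Toy data (made up for illustration): two well-separated clusters.
X = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.8, 5.3)]
y = [0, 0, 0, 1, 1, 1]

models = bagging_fit(X, y)
print(bagging_predict(models, (0.1, 0.1)))
print(bagging_predict(models, (5.0, 5.0)))
```

Each model sees a slightly different training set, so individual errors tend to differ and the vote smooths them out. For a regression task, the vote in bagging_predict would be replaced by an average of the predictions.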

The scikit-learn BaggingClassifier class allows us to bootstrap training examples, and we can also bootstrap features as in the random forests algorithm. When we perform a grid search, we refer to hyperparameters of the base estimator with the ...
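
As a hedged sketch of the setup described above, the following combines BaggingClassifier with GridSearchCV. It is not the book's listing: the synthetic dataset, the decision-tree base estimator, and the grid values are assumptions. It also assumes scikit-learn's usual double-underscore naming for nested parameters; because the exact prefix has changed between scikit-learn versions, the sketch looks the name up with get_params() instead of hard-coding it.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, assumed here purely for illustration.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

bag = BaggingClassifier(
    DecisionTreeClassifier(random_state=0),  # base estimator, passed positionally
    n_estimators=20,
    bootstrap=True,           # sample training examples with replacement
    bootstrap_features=True,  # sample features with replacement as well
    max_features=0.75,
    random_state=0,
)

# Find the nested name of the base estimator's max_depth parameter.
# The prefix before the double underscore depends on the scikit-learn version.
depth_key = [k for k in bag.get_params() if k.endswith("__max_depth")][0]

param_grid = {
    depth_key: [1, 3, None],    # hyperparameter of the base estimator
    "max_samples": [0.5, 1.0],  # hyperparameter of the bagging ensemble itself
}

search = GridSearchCV(bag, param_grid, cv=3)
search.fit(X, y)
print(search.best_params_)
```

Printing depth_key shows the prefix that the installed scikit-learn version expects, which is the name to use when writing the grid by hand.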

ISBN: 9781785282287