6 TWEAKING THE TREES

AdaBoost is the best off-the-shelf classifier in the world.

—CART co-inventor Leo Breiman, 1996

XGBoost is the algorithm of choice for many winning teams of machine learning competitions.

—Wikipedia entry, 2022

Here we talk about two general techniques in ML, bagging and boosting, and apply them to extend decision tree analysis. The resulting methods, random forests and tree-based gradient boosting, are widely used, in fact even more so than individual tree methods.

6.1 Bias vs. Variance, Bagging, and Boosting

For want of a nail the shoe was lost;

for want of a shoe the horse was lost;

and for want of a horse the man was ...
