Chapter 6. Classification and Regression Trees at Scale

In this chapter, we will focus on scalable methods for classification and regression trees. The following topics will be covered:

Tips and tricks for fast random forest applications in scikit-learn
Additive random forest models and subsampling
Gradient boosting machines (GBM)
XGBoost together with streaming methods
Very fast GBM and random forest in H2O

A decision tree learns a series of decision rules from the training data and uses them to infer target labels. Using a recursive algorithm, the process starts at the tree root and splits the data on the feature that results in the lowest impurity, then repeats on each resulting subset. Currently, the most widely used scalable tree-based methods are based on CART. Introduced ...
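
To make the idea of splitting on the lowest impurity concrete, here is a minimal sketch of a single split search. It assumes Gini impurity as the impurity measure and binary threshold splits on numeric features. The function names gini and best_split and the toy data are hypothetical, chosen for illustration, and are not taken from the book or from any library.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Search every feature and threshold for the lowest weighted impurity.

    Returns (feature_index, threshold, weighted_impurity). If no split
    improves on the unsplit node, feature_index and threshold are None.
    """
    n = len(rows)
    best = (None, None, gini(labels))
    for f in range(len(rows[0])):
        # Candidate thresholds are the distinct values of feature f.
        for t in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            if not left or not right:
                continue  # a split must send samples to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (f, t, score)
    return best


# Toy data (illustrative only): feature 0 separates the classes, feature 1 is constant.
rows = [[1, 5], [2, 5], [3, 5], [4, 5]]
labels = [0, 0, 1, 1]
feature, threshold, impurity = best_split(rows, labels)
```

A full tree would apply best_split recursively to the left and right subsets until a stopping rule is met, such as a maximum depth or a pure node. This exhaustive search is quadratic in the number of samples per feature, which is one reason scalable implementations rely on sorting, histograms, or subsampling instead.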

Large Scale Machine Learning with Python by Bastiaan Sjardin, Luca Massaron, Alberto Boschetti
