Ensembling decision trees – random forest

The ensemble technique bagging (short for bootstrap aggregating), which we briefly mentioned in Chapter 1, Getting Started with Machine Learning and Python, can effectively overcome overfitting. To recap, different sets of training samples are randomly drawn with replacement from the original training data; each resulting set is used to fit an individual classification model. The results of these separately trained models are then combined through a majority vote to make the final decision.
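To make this procedure concrete, here is a minimal from-scratch sketch of bagging with decision trees; the synthetic dataset, the number of trees, and the random seeds are illustrative stand-ins rather than choices from this book:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# A synthetic two-class dataset standing in for real training data
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

rng = np.random.default_rng(42)
n_trees = 50
all_predictions = []
for _ in range(n_trees):
    # Draw a bootstrap sample: same size as the training set, with replacement
    indices = rng.integers(0, len(X_train), size=len(X_train))
    # Fit one individual classification model per bootstrap sample
    tree = DecisionTreeClassifier()
    tree.fit(X_train[indices], y_train[indices])
    all_predictions.append(tree.predict(X_test))

# Majority vote across the separately trained trees
# (ties are broken in favor of class 1)
votes = np.stack(all_predictions)
final_prediction = (votes.mean(axis=0) >= 0.5).astype(int)
accuracy = (final_prediction == y_test).mean()
print(f'Bagged trees accuracy: {accuracy:.3f}')
```

In practice you would not write this loop by hand; the same idea is packaged in scikit-learn's BaggingClassifier, which uses a decision tree as its default base learner.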

Tree bagging, as described above, reduces the high variance that a single decision tree suffers from and hence, in general, performs better than an individual tree. However, in some cases, ...
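To illustrate the variance reduction claimed above, the following hedged sketch compares a single decision tree against a bagged ensemble via cross-validation; again, the dataset and parameters are arbitrary stand-ins, not values from this book:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

# A single, fully grown tree tends to overfit (high variance)
single_tree = DecisionTreeClassifier(random_state=42)
# Bagging many such trees averages out much of that variance;
# BaggingClassifier's default base learner is a decision tree
bagged_trees = BaggingClassifier(n_estimators=100, random_state=42)

for name, model in [('Single tree', single_tree),
                    ('Bagged trees', bagged_trees)]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f'{name}: mean accuracy = {scores.mean():.3f}')
```

On most runs, the bagged ensemble scores noticeably higher than the single tree, which is exactly the variance-reduction effect described in the text.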
