In Chapter 4, we looked at the process of building a decision tree. Decision trees can overfit the data in some cases, for example, when there are outliers in the dependent variable. Having correlated independent variables may also result in the wrong variable being selected for splitting at the root node.
Random forest overcomes these challenges by building multiple decision trees, where each decision tree works on a sample of the data. Let’s break down the term: random refers to the random sampling of data from the original dataset, and forest refers to the collection of decision trees built on those samples.
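The following is a minimal sketch of this idea, assuming scikit-learn and the Iris dataset as a stand-in for the data used in the book: each tree is fit on a bootstrap sample (rows drawn at random with replacement) of the training set, and the trees' predictions are combined by majority vote. The number of trees and the dataset are illustrative assumptions, not taken from the text.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Illustrative dataset (assumption): Iris, split into train and test sets
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

n_trees = 25          # illustrative choice of forest size
rng = np.random.default_rng(42)
trees = []

for _ in range(n_trees):
    # "random": draw a bootstrap sample of the training rows
    idx = rng.integers(0, len(X_train), size=len(X_train))
    tree = DecisionTreeClassifier(random_state=0)
    tree.fit(X_train[idx], y_train[idx])
    trees.append(tree)

# "forest": aggregate the individual trees' predictions by majority vote
all_preds = np.stack([t.predict(X_test) for t in trees])  # shape: (n_trees, n_samples)
majority = np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, all_preds)

print("test accuracy:", (majority == y_test).mean())
```

In practice, you would typically use scikit-learn's RandomForestClassifier, which performs this bagging internally and additionally considers only a random subset of features at each split, which helps with the correlated-variables issue mentioned above.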