The purpose of a random forest is to ensemble weak learners (for example, single decision trees) into a strong learner. The process of building a random forest is very similar to the bagging method. Assume we have a training set containing N samples, each with M features. The process first performs bootstrap sampling, drawing N cases at random with replacement to serve as the training dataset of each individual decision tree. Next, at each node, the algorithm randomly selects m variables (where m << M) and then finds the predictor among those m variables that provides the best split. Each tree is then grown to full depth without pruning. In the end, we obtain a prediction for an example from each individual tree and aggregate the results, by majority vote for classification or by averaging for regression.
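As a quick illustration of these steps, here is a minimal sketch (not taken from the recipe itself) that trains a random forest on the built-in iris data using the randomForest package; the ntree and mtry values are illustrative assumptions rather than tuned settings:

# Load the randomForest package (install.packages("randomForest") if needed)
library(randomForest)

# Fix the seed so the bootstrap sampling is reproducible
set.seed(42)

# Grow 500 unpruned trees; at each node, mtry = 2 of the M = 4
# predictors are sampled as split candidates (m << M)
model <- randomForest(Species ~ ., data = iris, ntree = 500, mtry = 2)

# Each tree votes on a new example; the majority vote is returned
predict(model, iris[1, ])

Note that mtry defaults to roughly the square root of M for classification, so setting it explicitly is only needed when you want to control the size of the random variable subset.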