Random forests

The final classifier we will discuss in this chapter is the aptly named random forest, an example of a meta-technique called ensemble learning. The idea behind random forests is as follows:

Given that (unpruned) decision trees can be nearly unbiased but high-variance classifiers, a method that reduces variance at the cost of a marginal increase in bias could greatly improve the technique's predictive accuracy. One salient approach to reducing the variance of decision trees is to train many unpruned decision trees on different random subsets of the training data, sampling with replacement. This is called bootstrap aggregating, or bagging. At the classification phase, the test observation is run through ...
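The bagging idea described above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration under stated assumptions, not the chapter's own implementation: the function names (grow_tree, predict_tree, bagged_trees, predict_bagged), the toy data, and the choice of 25 trees are all hypothetical. The text above is cut off before it says how the trees' outputs are combined, so the sketch assumes a majority vote, which is the usual aggregation for bagged classifiers.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def grow_tree(rows, labels):
    """Grow an unpruned tree: keep splitting until a node is pure or unsplittable.
    Leaves are class labels (assumed not to be tuples); internal nodes are
    (feature_index, threshold, left_subtree, right_subtree) tuples."""
    if len(set(labels)) == 1:
        return labels[0]
    best = None
    for j in range(len(rows[0])):
        # Dropping the largest value guarantees both sides of a split are non-empty.
        for t in sorted(set(r[j] for r in rows))[:-1]:
            left = [i for i, r in enumerate(rows) if r[j] <= t]
            right = [i for i, r in enumerate(rows) if r[j] > t]
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right]))
            if best is None or score < best[0]:
                best = (score, j, t, left, right)
    if best is None:  # identical rows with mixed labels: fall back to a majority leaf
        return Counter(labels).most_common(1)[0][0]
    _, j, t, left, right = best
    return (j, t,
            grow_tree([rows[i] for i in left], [labels[i] for i in left]),
            grow_tree([rows[i] for i in right], [labels[i] for i in right]))

def predict_tree(tree, row):
    """Walk from the root to a leaf and return the leaf's class label."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

def bagged_trees(rows, labels, n_trees=25, seed=0):
    """Train n_trees unpruned trees, each on a bootstrap sample of the data."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # n draws with replacement
        forest.append(grow_tree([rows[i] for i in idx],
                                [labels[i] for i in idx]))
    return forest

def predict_bagged(forest, row):
    """Assumed aggregation step: majority vote over the individual trees."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated clusters labeled "a" and "b".
rows = [(1.0, 1.2), (1.5, 0.8), (2.0, 1.0), (1.2, 1.9),
        (6.0, 6.5), (6.5, 5.8), (7.0, 6.1), (5.8, 7.0)]
labels = ["a"] * 4 + ["b"] * 4

forest = bagged_trees(rows, labels)
print(predict_bagged(forest, (0.5, 0.5)))
print(predict_bagged(forest, (6.4, 6.2)))
```

Because each tree sees a different bootstrap sample, the individual trees can disagree on borderline observations. Combining their predictions is what smooths out the variance of any single unpruned tree.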
