R Projects For Dummies by Joseph Schmuller

Chapter 8

Into the Forest, Randomly

IN THIS CHAPTER

Looking at random forests

Growing a random forest for irises

Developing a random forest for glass identification

In Chapter 7, I help you explore decision trees. Suppose a decision tree is an expert decision-maker: Give a tree a set of data, and it makes decisions about the data. Taking this idea a step further, suppose you have a panel of experts (a group of decision trees), and each one makes a decision about the same data. You could poll the panel to come up with the best decision.

This is the idea behind the random forest — a collection of decision trees that you can poll, and the majority vote is the decision.
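
To make the polling idea concrete, here is a minimal sketch in base R. The votes are invented for illustration (they do not come from fitted trees), and majority_vote() is a hypothetical helper name, not a function from rattle or any other package.

```r
# Hypothetical votes from a panel of five decision trees about one iris flower.
# These values are made up for illustration, not produced by a fitted model.
votes <- c("setosa", "versicolor", "setosa", "virginica", "setosa")

# majority_vote() is an illustrative helper, not part of any package.
# It tallies the votes and returns the class with the highest count.
# If two classes tie, the one that sorts first alphabetically wins,
# because table() orders its names and which.max() takes the first maximum.
majority_vote <- function(votes) {
  tally <- table(votes)
  names(tally)[which.max(tally)]
}

decision <- majority_vote(votes)
print(decision)  # "setosa": three of the five trees voted for it
```

A real random forest does this tallying for you. The sketch only shows what "polling the panel" amounts to for a single observation.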

Growing a Random Forest

So how does all this happen? How do you create a forest out of a dataset? Well, randomly.

Here's what I mean. In Chapter 7, I discuss the creation of a decision tree from a dataset. I use the rattle package to partition a data frame into a training set, a validation set, and a test set. The partitioning takes place as a result of random sampling from the rows in the data frame. The default condition is that rattle randomly assigns 70 percent of the rows to the training set, 15 percent to the ...
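
The random partitioning described above can be sketched in base R. This is not rattle's own code, just one way to get the same kind of split. It uses the iris data frame that ships with R. The 70 and 15 percent figures come from the text, while sending all remaining rows to the test set is an assumption here. The seed value and the variable names are arbitrary.

```r
# A base R sketch of a random train/validation/test partition.
# Assumption: rows not assigned to training or validation go to the test set.
set.seed(123)  # arbitrary seed so the random split is repeatable

n <- nrow(iris)        # iris ships with R
shuffled <- sample(n)  # random permutation of the row numbers 1..n

n_train <- floor(0.70 * n)  # about 70 percent of the rows
n_valid <- floor(0.15 * n)  # about 15 percent of the rows

# Slice the shuffled row numbers into three non-overlapping groups.
train_idx <- shuffled[1:n_train]
valid_idx <- shuffled[(n_train + 1):(n_train + n_valid)]
test_idx  <- shuffled[(n_train + n_valid + 1):n]

train_set <- iris[train_idx, ]
valid_set <- iris[valid_idx, ]
test_set  <- iris[test_idx, ]

# Row counts per partition; together they account for every row exactly once.
sizes <- c(train = nrow(train_set),
           validation = nrow(valid_set),
           test = nrow(test_set))
print(sizes)
```

Because the row numbers are shuffled before they are sliced, each run with a different seed puts different rows in each set, which is the sense in which the partition is random.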
