How it works...
In Step 1, we split the dataset in the same way as in several previous recipes. Using the sample() function, we draw a random 80% of the row numbers of the original iris data and then, using subsetting and negative subsetting, we extract the training rows and the remaining rows.
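The split described above can be sketched as follows. This is a minimal illustration rather than the recipe's exact code: the seed and the names train_rows and test_set are assumptions made here, while train_set is the name the recipe uses.

```r
# Assumption: a fixed seed, so the random split is reproducible.
set.seed(42)

# Draw 80% of the row numbers of iris at random, without replacement.
train_rows <- sample(nrow(iris), floor(0.8 * nrow(iris)))

# Subsetting keeps the sampled rows; negative subsetting keeps the rest.
train_set <- iris[train_rows, ]
test_set <- iris[-train_rows, ]
```

Because the same vector of row numbers drives both subsets, no row can end up in both train_set and test_set.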
In Step 2, we train the model using the randomForest() function. The first argument is a formula; we're specifying that Species is the value we wish to predict based on all other variables, which are denoted by the . shorthand. The data argument is our train_set object. The key in this recipe is to set the importance argument to TRUE, so that the model also measures which variables cause the biggest decrease in accuracy when their information is withheld from the model. Once the model is built and ...
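A training call along the lines described for Step 2 might look like the sketch below. It assumes the randomForest package is installed; the seed and the names train_rows, model, and imp are illustrative choices made here, not taken from the recipe.

```r
library(randomForest)

# Assumption: a fixed seed; both the split and the forest are random.
set.seed(42)
train_rows <- sample(nrow(iris), floor(0.8 * nrow(iris)))
train_set <- iris[train_rows, ]

# Predict Species from all other columns (the . shorthand) and ask the
# model to compute variable importance measures while it trains.
model <- randomForest(Species ~ ., data = train_set, importance = TRUE)

# importance() returns one row per predictor; with importance = TRUE the
# columns include MeanDecreaseAccuracy and MeanDecreaseGini.
imp <- importance(model)
print(imp)
```

Larger values of MeanDecreaseAccuracy indicate predictors the model relies on more heavily; the exact numbers vary with the seed and the split.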