October 2019
Intermediate to advanced
316 pages
9h 45m
English
Identifying the most important variables in data with random forests can be done using the following steps:
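# Load the package and hold out 20% of the iris rows as a test set
library(randomForest)
train_rows <- sample(nrow(iris), 0.8 * nrow(iris), replace = FALSE)
train_set <- iris[train_rows, ]
test_set <- iris[-train_rows, ]

# Fit the forest with importance tracking enabled, then plot variable importance
model <- randomForest(Species ~ ., data = train_set, mtry = 2, importance = TRUE)
varImpPlot(model)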
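The plot gives a visual ranking; the same scores can also be read as numbers. The following is a minimal sketch, not part of the original recipe, that repeats the split and model fit so it runs on its own and then ranks the predictors with `importance()`. The fixed seed and the names `imp` and `ranked` are assumptions added for illustration.

```r
library(randomForest)

# Assumption: a fixed seed so the split and the forest are reproducible
set.seed(42)

# Same 80/20 split and model settings as in the steps above
train_rows <- sample(nrow(iris), 0.8 * nrow(iris), replace = FALSE)
train_set <- iris[train_rows, ]
model <- randomForest(Species ~ ., data = train_set, mtry = 2, importance = TRUE)

# importance() returns a matrix with one row per predictor; because the model
# was fitted with importance = TRUE, it includes a MeanDecreaseAccuracy column
imp <- importance(model)

# Sort predictors from most to least important by mean decrease in accuracy
ranked <- imp[order(imp[, "MeanDecreaseAccuracy"], decreasing = TRUE), ]
print(ranked)
```

Rows near the top of `ranked` are the variables whose permutation hurts accuracy the most. The exact values depend on the random split and the seed, so they will differ from run to run unless the seed is fixed.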