Random forest

The customer satisfaction data was covered in Chapter 3, Logistic Regression. The GitHub links to the CSV and an RData file are as follows:

I'll show you how to load the RData file:

> santander <- readRDS("santander_prepd.RData")

The data has an unbalanced response:

> table(santander$y)

    0     1 
73012  3008 
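To put the imbalance in relative terms, the counts above can be turned into proportions. This is a minimal sketch that re-enters the two counts by hand so it runs without the data file; the names `counts` and `props` are illustrative and not from the book.

```r
# Class counts copied from the table() output above
counts <- c(`0` = 73012, `1` = 3008)

# Convert counts to proportions of the total (76,020 rows)
props <- prop.table(counts)

# Roughly 0.9604 for class 0 and 0.0396 for class 1
print(round(props, 4))
```

With the data loaded, `prop.table(table(santander$y))` should give the same proportions. Only about 4% of the rows belong to the positive class.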

We'll split the train and test sets using the same random seed as in Chapter 3, Logistic Regression:

> set.seed(1966)
> trainIndex <- caret::createDataPartition(santander$y, p = 0.8, list = FALSE)
> train <- santander[trainIndex, ...
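With a response this unbalanced, it matters that the split keeps the class ratio similar in both sets. When the outcome is a factor, `caret::createDataPartition` samples within each level of the outcome for that reason. The sketch below illustrates the idea in base R on a small made-up response vector. It is not caret's implementation, and `y`, `train_idx`, `train_y`, and `test_y` are hypothetical names used only for this illustration.

```r
set.seed(1966)

# Synthetic unbalanced response (assumption): 960 zeros and 40 ones
y <- factor(c(rep(0, 960), rep(1, 40)))

# Sample 80% of the row indices separately within each class
train_idx <- unlist(
  lapply(split(seq_along(y), y),
         function(idx) sample(idx, size = round(0.8 * length(idx)))),
  use.names = FALSE
)

train_y <- y[train_idx]
test_y  <- y[-train_idx]

# Both sets keep the 96% / 4% class ratio
print(prop.table(table(train_y)))
print(prop.table(table(test_y)))
```

A plain `sample()` over all rows would match the class ratio only approximately, which is more noticeable when the minority class is small.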
