Random forest

The customer satisfaction data was covered in Chapter 3, Logistic Regression. The GitHub links to the CSV and an RData file are as follows:

I'll show you how to load the RData file:

> santander <- readRDS("santander_prepd.RData")
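The file carries an .RData extension but is read with readRDS(), which suggests it was written with saveRDS() rather than save(). readRDS() ignores the extension and returns a single object that you assign to a name of your choice, whereas load() restores objects under their saved names. Here is a minimal sketch of that round trip, assuming a hypothetical toy data frame and a temporary file rather than the actual Santander file:

```r
# Hypothetical toy data frame; this is NOT the Santander data
toy <- data.frame(x = 1:3, y = factor(c(0, 1, 0)))

# Serialize a single object with saveRDS() to a temp file ending in .RData
path <- tempfile(fileext = ".RData")
saveRDS(toy, path)

# readRDS() returns the object itself, so it can be bound to any name
restored <- readRDS(path)
print(identical(toy, restored))

# Clean up the temporary file
unlink(path)
```

If readRDS() fails on a file you saved yourself with save(), try load() instead.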

The data has an unbalanced response:

> table(santander$y)
    0     1
73012  3008
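To put a number on the imbalance, the counts can be converted to proportions. The sketch below hard-codes the two counts printed above so that it runs without the data file; with the data loaded, prop.table(table(santander$y)) should give the same proportions. The names counts, props, and imbalance_ratio are illustrative only.

```r
# Class counts copied from the table() output above
counts <- c("0" = 73012, "1" = 3008)

# Share of each class; the positive class comes to roughly 4% of the rows
props <- prop.table(counts)
print(round(100 * props, 2))

# Number of negative cases for every positive case
imbalance_ratio <- counts[["0"]] / counts[["1"]]
print(round(imbalance_ratio, 1))
```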

We'll split the train and test sets using the same random seed as in Chapter 3, Logistic Regression:

> set.seed(1966)
> trainIndex <- caret::createDataPartition(santander$y, p = 0.8, list = FALSE)
> train <- santander[trainIndex, ]
> test <- santander[-trainIndex, ]
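When the outcome is a factor, createDataPartition() samples within each class, so both partitions keep roughly the original class balance. The following base-R sketch illustrates that idea on a synthetic response vector built from the class counts shown earlier. It is a stand-in for explanation, not caret's implementation, and it will not reproduce the same row indices as the split above.

```r
# Synthetic stand-in for santander$y, built from the printed class counts
y <- factor(rep(c(0, 1), times = c(73012, 3008)))

set.seed(1966)  # same seed as the text, but the indices drawn here will differ

# Stratified 80/20 split: draw 80% of the row indices within each class
idx_by_class <- split(seq_along(y), y)
train_idx <- unlist(lapply(idx_by_class, function(idx) {
  sample(idx, size = round(0.8 * length(idx)))
}), use.names = FALSE)

y_train <- y[train_idx]
y_test  <- y[-train_idx]

# The class proportions should be nearly identical in both partitions
print(round(prop.table(table(y_train)), 4))
print(round(prop.table(table(y_test)), 4))
```

On the real data, the equivalent check would be to compare prop.table(table(train$y)) with prop.table(table(test$y)).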

With this split, ...
