In this recipe, we will work with an imbalanced dataset and plot its ROC and precision-recall curves. The data contains:
- We first load the dataset and split it into training and testing data:
    library(MASS)
    library(caret)
    library(PRROC)
    library(precrec)

    set.seed(10)
    data = read.csv("./approved.csv")
    # Drop columns 1 and 7
    data = data[,-c(1,7)]
    # Recode the 0/1 Approved column as a labelled factor, Approved_
    data$Approved_ = "not_approved"
    data$Approved_[data$Approved == 1] <- "approved"
    data$Approved_ = as.factor(data$Approved_)
    # Drop the first remaining column
    data = data[,-1]
    # Put 75% of the rows in the training set, stratified on Approved_
    trainIndex <- createDataPartition(data$Approved_, p = .75, list = FALSE, times = 1)
    traindata <- data[trainIndex,]
    testdata <- data[-trainIndex,]
- We fit a model using the training data with tuneLength = 10. This means that caret will construct a grid for us containing ...
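To make the effect of tuneLength concrete, here is a minimal sketch. It assumes a k-nearest-neighbours model (method = "knn") fitted to simulated data; neither comes from this recipe, and the names toy, ctrl and fit are hypothetical. The point is only that caret builds the candidate grid itself and evaluates one row of results per candidate value.

```r
library(caret)

set.seed(10)

# Simulated, moderately imbalanced data (an assumption for illustration;
# this is NOT the recipe's approved.csv).
n  <- 300
x1 <- rnorm(n)
x2 <- rnorm(n)
label <- ifelse(x1 + x2 + rnorm(n) > 1.5, "approved", "not_approved")
toy <- data.frame(x1 = x1, x2 = x2, Approved_ = factor(label))

# 5-fold cross-validation is an assumed resampling scheme.
ctrl <- trainControl(method = "cv", number = 5)

# tuneLength = 10 asks caret to generate 10 candidate values of the
# tuning parameter (k for knn) instead of us supplying a tuneGrid.
fit <- train(Approved_ ~ ., data = toy, method = "knn",
             trControl = ctrl, tuneLength = 10)

# One row per candidate value of k, with its resampled accuracy.
print(fit$results[, c("k", "Accuracy")])

# The candidate that caret selected.
print(fit$bestTune)
```

If specific candidate values are needed, the tuneGrid argument of train() accepts an explicit data frame of tuning parameters in place of tuneLength.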