For this example, we will use the breast cancer dataset from the University of Wisconsin. Details on this dataset are available from the UCI Machine Learning Repository at http://archive.ics.uci.edu/ml/datasets/breast+cancer+wisconsin+%28diagnostic%29.
- The dataset can be loaded using the following lines of code:
library(tidyverse)
wbdc <- readr::read_csv("http://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/wdbc.data", col_names = FALSE)
- After loading the data, we will encode the target variable. This column is a character column with two values that indicate whether or not there were signs of malignancy, and we will convert it to a numeric column holding binary values. ...
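The encoding step described above can be sketched as follows. This is only a minimal illustration, not necessarily the approach used in the rest of this example. It assumes that the diagnosis sits in the second column, which readr names `X2` when `col_names = FALSE`, and that its two values are "M" (malignant) and "B" (benign). The data frame `wbdc_demo` is a hypothetical toy stand-in with made-up rows, so the snippet runs without downloading anything.

```r
# Hypothetical toy stand-in for the first two columns of `wbdc`.
# The rows are made up for illustration only.
# Assumption: X1 is the sample ID and X2 is the diagnosis ("M" or "B").
wbdc_demo <- data.frame(
  X1 = c(101, 102, 103, 104),
  X2 = c("M", "B", "B", "M"),
  stringsAsFactors = FALSE
)

# Encode the target variable: 1 for malignant, 0 for benign.
wbdc_demo$X2 <- ifelse(wbdc_demo$X2 == "M", 1, 0)

# Inspect the type and values of the encoded column.
str(wbdc_demo$X2)
```

Under the same assumptions, the same `ifelse()` call could be applied to `wbdc$X2` on the real data. It is worth checking `unique(wbdc$X2)` first to confirm that the column holds only the two expected values.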