May 2019
Intermediate to advanced
664 pages
15h 41m
English
Here, we are going to put the numeric features into a dataframe along with the quantitative response. Then, we'll carve this up into train and test sets with an 80/20 split. Finally, we'll scale the data, which PCA needs so that features measured on larger scales don't dominate the components.
Here, I grab those input features, keeping height in inches while dropping weight in kilograms. I also keep subjectid, which serves as the join key when we split the data:
> army_subset <- armyClean[, c(1:91, 93, 94, 106, 107)]
We've used both the dplyr and caret packages to create train and test sets; here, I demonstrate the dplyr method:
> set.seed(1812)
> army_subset %>% dplyr::sample_frac(.8) -> train
> army_subset %>% dplyr::anti_join(train, by = "subjectid") -> test
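To make the mechanics of that split concrete, here is a minimal sketch on a toy dataframe. The `toy` data and its `height` and `weight` columns are hypothetical stand-ins for `army_subset`, not the real data. It assumes `subjectid` is unique per row, which is what lets `anti_join()` return exactly the rows that `sample_frac()` did not pick:

```r
library(dplyr)

set.seed(1812)

# Hypothetical stand-in for army_subset: 100 rows, one unique subjectid each
toy <- data.frame(
  subjectid = 1:100,
  height = rnorm(100, mean = 68, sd = 3),
  weight = rnorm(100, mean = 175, sd = 25)
)

# Sample 80% of the rows (without replacement) for training
toy %>% dplyr::sample_frac(.8) -> train

# Everything whose subjectid is not in train becomes the test set
toy %>% dplyr::anti_join(train, by = "subjectid") -> test

# Sanity checks: the two sets cover every row and share no subject
stopifnot(nrow(train) + nrow(test) == nrow(toy))
stopifnot(length(intersect(train$subjectid, test$subjectid)) == 0)
```

If `subjectid` were not unique, `anti_join()` would drop every row sharing an ID with a sampled row, and the test set would come out smaller than 20% of the data, so a quick check such as `anyDuplicated(army_subset$subjectid) == 0` before splitting is worth considering.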
I mentioned previously that this data had a ...