As stated previously, we'll begin with feature selection, and the caret package helps out in this matter. In recursive feature elimination (RFE), a model is first built using all of the features and an importance value is assigned to each feature. The least important features are then recursively pruned, and the optimal number of features is selected based on a performance metric such as accuracy. In short, it's a type of backward feature elimination.
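To make the idea concrete, here is a minimal conceptual sketch of backward feature elimination (not caret's implementation). It assumes a data frame named train with a factor response y; both names are placeholders for illustration. The caret workflow described next automates this loop and adds cross-validated resampling.

```r
# Conceptual backward feature elimination -- `train` and `y` are placeholder names
library(randomForest)

features <- setdiff(names(train), "y")
history  <- data.frame(n_features = integer(), accuracy = numeric())

while (length(features) >= 1) {
  # Fit a model on the current feature set
  fit <- randomForest(reformulate(features, response = "y"), data = train)
  acc <- mean(predict(fit, train) == train$y)   # resubstitution accuracy, for illustration only
  history <- rbind(history,
                   data.frame(n_features = length(features), accuracy = acc))

  # Drop the single least important feature and repeat
  imp      <- importance(fit)[, "MeanDecreaseGini"]
  features <- setdiff(features, names(which.min(imp)))
}

history   # inspect performance by subset size
```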
To do this, we'll need to set the random seed, specify the cross-validation method in caret's rfeControl() function, perform recursive feature selection with the rfe() function, and then test how the model performs on the test set. In rfeControl(), you'll need to specify the set of helper functions that matches the model being used. There are several different functions that ...
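A minimal sketch of this workflow is shown below, assuming a classification task; the objects x_train, y_train, x_test, and y_test, as well as the choice of the random forest helper functions (rfFuncs), are placeholders for illustration.

```r
library(caret)

set.seed(123)

# Resampling settings; `functions` should match the type of model being fit,
# for example rfFuncs for random forests or lrFuncs for logistic regression
ctrl <- rfeControl(functions = rfFuncs,
                   method = "cv",
                   number = 10,
                   verbose = FALSE)

# Recursive feature elimination over candidate subset sizes
rfe_fit <- rfe(x = x_train,
               y = y_train,
               sizes = c(2, 4, 6, 8),
               rfeControl = ctrl)

rfe_fit               # cross-validated performance by subset size
predictors(rfe_fit)   # the features retained in the best subset

# Assess the selected model on the held-out test set
test_pred <- predict(rfe_fit, x_test)
postResample(test_pred$pred, y_test)   # accuracy and kappa for classification
```

With rfFuncs, predict() on the rfe object returns a data frame whose pred column holds the class predictions, which is why test_pred$pred is passed to postResample() above.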