The following code is, for the most part, a rehash of what we developed in Chapter 2, Linear Regression - The Blocking and Tackling of Machine Learning. We will create the best subset object using the regsubsets() command, passing it the train portion of the data. The variables that are selected will then be used in a model on the test set, which we will evaluate with a mean squared error calculation.
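The evaluation step described above can be sketched as follows. This is only an illustration on simulated data: the data frame sim, the 70/30 split, and the column names are assumptions, and lm() with two hand-picked predictors stands in for whichever variables best subsets ends up selecting.

```r
# Simulated data standing in for the real data set (hypothetical)
set.seed(42)
sim <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
sim$y <- 3 + 1.5 * sim$x1 - 2 * sim$x2 + rnorm(100)

# Hypothetical 70/30 split into train and test portions
idx <- sample(nrow(sim), 70)
sim.train <- sim[idx, ]
sim.test <- sim[-idx, ]

# Fit on the train portion using the (assumed) selected variables
fit <- lm(y ~ x1 + x2, data = sim.train)

# Predict on the test portion and compute the mean squared error
pred <- predict(fit, newdata = sim.test)
mse <- mean((sim.test$y - pred)^2)
print(mse)
```

The mean squared error is simply the average of the squared differences between the observed and predicted responses, so a lower value indicates predictions that sit closer to the held-out observations.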
The model that we are building is written out as lpsa ~ ., where the tilde separates the response from the predictors and the period states that we want to use all the remaining variables in our data frame, that is, every variable except the response:
> subfit <- regsubsets(lpsa ~ ., data = train)
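To see what the period expands to, here is a minimal sketch on a small made-up data frame. The data frame toy and its columns x1 to x3 are hypothetical stand-ins for the training data, not the actual train object, and the example assumes the leaps package, which provides regsubsets(), is installed.

```r
library(leaps)  # provides regsubsets(); assumed to be installed

# Toy stand-in for the training data (hypothetical, not the real data)
set.seed(123)
toy <- data.frame(x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50))
toy$lpsa <- 1 + 2 * toy$x1 - toy$x2 + rnorm(50, sd = 0.5)

# The period expands to every column except the response
predictors <- attr(terms(lpsa ~ ., data = toy), "term.labels")
print(predictors)

# Same call shape as in the text, applied to the toy data
toy.fit <- regsubsets(lpsa ~ ., data = toy)
print(class(toy.fit))
```

Writing the formula with a period keeps the call short and means it does not need to change if columns are added to or removed from the data frame.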
With the model built, you can produce the best subset with two lines of code. The first one ...