In Chapter 3, More Than Just One Predictor – Multiple Linear Regression, we saw that multiple linear regression models are easy to assemble, and they are also easy to interpret. These models are particularly accurate in many cases, especially when the relationship between the response and the predictors is clearly linear. However, it is often the case that not all variables used in a multiple regression model are associated with the response.
Unrelated variables with the response are not only irrelevant, but their presence leads to useless complexity in the model. By removing them, we can get a more easily interpretable model.
Subset selection refers to the task of finding a small subset of available independent ...