June 2017
Beginner to intermediate
576 pages
15h 22m
English
In certain cases in which you have a large number of variables, you can end up overfitting the model. This will result in a biased model with some variables having more of an influence on the prediction than is reasonable to expect.
One way to deal with this is to select a subset of the original variables. We demonstrated all subsets regression which only considers a reduced number of variables for a final model. However, in some cases you may want to keep all or most of the variables in your model. For example, if you know via past studies that particular variables should always be included, you would want to keep that in the model, even though it might drop out of contention after a variable selection procedure has been run. ...