R in a Nutshell, 2nd Edition

by Joseph Adler
October 2012
Beginner to intermediate
721 pages
21h 38m
English
O'Reilly Media, Inc.

Subset Selection and Shrinkage Methods

Modeling functions like lm will include every variable specified in the formula, calculating a coefficient for each one. Unfortunately, this means that lm may calculate coefficients for variables that aren’t needed. You can manually tune a model using diagnostics like summary and lm.influence. However, you can also use some other statistical techniques to reduce the effect of insignificant variables or remove them from a model altogether.
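As a minimal sketch of that behavior, the following fits a linear model and inspects it with summary and lm.influence. The built-in mtcars data set and the particular predictors are assumptions chosen for illustration, not an example from the book.

```r
# Illustrative only: mtcars and these predictors are assumptions.
fit <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)

# lm estimates one coefficient per term in the formula, plus the intercept,
# whether or not the term contributes much to the fit.
coefs <- coef(summary(fit))
print(coefs)  # columns: Estimate, Std. Error, t value, Pr(>|t|)

# lm.influence returns per-observation diagnostics, including leverage (hat).
infl <- lm.influence(fit)
print(head(sort(infl$hat, decreasing = TRUE), 3))
```

Terms with large p-values in the coefficient table are candidates for removal, which is the decision the techniques in this section try to automate.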

Stepwise Variable Selection

A simple technique for selecting the most important variables is stepwise variable selection. The stepwise algorithm works by repeatedly adding or removing variables from the model, trying to “improve” the model at each step. When the algorithm can no longer improve the model by adding or subtracting variables, it stops and returns the new (and usually smaller) model.
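A single backward step can be inspected by hand with drop1, which reports the criterion for the model with each term removed in turn. This is only a sketch of one iteration; the mtcars data set and the starting formula are assumptions made for illustration.

```r
# Illustrative only: mtcars and the starting formula are assumptions.
fit <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)

# One row per candidate removal, plus "<none>" for the current model.
candidates <- drop1(fit)
print(candidates)

# The candidate with the lowest AIC is the move a backward step would favor.
best <- rownames(candidates)[which.min(candidates$AIC)]
print(best)  # "<none>" would mean no single removal improves the model
```

Repeating this until "<none>" has the lowest value is, roughly, what a backward stepwise search does.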

Note that “improvement” does not just mean reducing the residual sum of squares (RSS) for the fitted model. Adding a variable to a model never increases the RSS, because least squares can always reproduce the smaller model's fit by setting the new coefficient to zero, but it does increase model complexity. Typically, AIC (Akaike’s information criterion) is used to measure the value of each additional variable. The AIC is defined as AIC = −2 log(L) + k · edf, where L is the likelihood, edf is the equivalent degrees of freedom, and k is the penalty per degree of freedom (k = 2 gives the standard AIC).
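Both points can be checked numerically. The sketch below compares the RSS of two nested models and recomputes the AIC from the formula with k = 2. The mtcars data set, the model formulas, and the helper names rss and aic_by_hand are assumptions introduced for illustration.

```r
# Illustrative only: mtcars, the formulas, and the helper names are assumptions.
small <- lm(mpg ~ wt, data = mtcars)
large <- lm(mpg ~ wt + qsec, data = mtcars)

# RSS of the larger nested model cannot exceed that of the smaller one.
rss <- function(m) sum(residuals(m)^2)
print(c(small = rss(small), large = rss(large)))

# AIC = -2 * log(L) + k * edf, with edf taken from the log-likelihood object.
aic_by_hand <- function(m, k = 2) {
  ll <- logLik(m)
  -2 * as.numeric(ll) + k * attr(ll, "df")
}
print(all.equal(aic_by_hand(small), AIC(small)))
```

For linear models, step reports values from extractAIC, which can differ from AIC() by an additive constant; the ranking of candidate models on the same data is what matters.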

In R, you perform stepwise selection through the step function:

step(object, scope, scale = 0, direction = c("both", "backward", "forward"), ...
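As a hedged usage sketch, the calls below run a backward search from a full model and a forward search from an intercept-only model. The mtcars data set and the candidate predictors are assumptions chosen for illustration; trace = 0 only suppresses the per-step printout.

```r
# Illustrative only: mtcars and the candidate predictors are assumptions.
full <- lm(mpg ~ wt + hp + qsec + drat + disp, data = mtcars)

# Backward search: start from the full model and drop terms while AIC improves.
reduced <- step(full, direction = "backward", trace = 0)
print(formula(reduced))

# Forward search: start from the intercept-only model; scope bounds the search.
null <- lm(mpg ~ 1, data = mtcars)
grown <- step(null,
              scope = list(lower = ~ 1, upper = ~ wt + hp + qsec + drat + disp),
              direction = "forward", trace = 0)
print(formula(grown))

# Compare the criterion step optimizes (second element of extractAIC).
print(c(full = extractAIC(full)[2], reduced = extractAIC(reduced)[2]))
```

The two directions are not guaranteed to arrive at the same model, so it can be worth running both and comparing the results.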


Publisher Resources

ISBN: 9781449358204