
Data Analysis with R - Second Edition by Tony Fischetti

Kitchen sink regression

When the goal of using regression is simply predictive modeling, we often don't care about which particular predictors go into our model, so long as the final model yields the best possible predictions.

A naïve (and awful) approach is to use all the available independent variables to model the dependent variable. Let's try this approach by predicting mpg from every other variable in the mtcars dataset, using the following code:

  # the period after the squiggly denotes all other variables 
  model <- lm(mpg ~ ., data=mtcars) 
  summary(model) 

  Call:
  lm(formula = mpg ~ ., data = mtcars)

  Residuals:
      Min      1Q  Median      3Q     Max
  -3.4506 -1.6044 -0.1196  1.2193  4.6271

  Coefficients:
              Estimate Std. Error t value Pr(>|t|)
  (Intercept) ...
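
To make the period shorthand concrete, here is a minimal sketch that writes out the same formula by hand and checks that both fits agree. It assumes the built-in mtcars data frame; the names kitchen_sink, predictors, and explicit are illustrative and do not come from the original text.

```r
# Assumption: the built-in mtcars data frame, with mpg as one of its columns.
kitchen_sink <- lm(mpg ~ ., data = mtcars)

# Spell out what the period stands for: every column except mpg.
predictors <- setdiff(names(mtcars), "mpg")
explicit_formula <- reformulate(predictors, response = "mpg")
explicit <- lm(explicit_formula, data = mtcars)

# The two fits should produce the same coefficients.
all.equal(coef(kitchen_sink), coef(explicit))

# One coefficient per predictor, plus the intercept.
length(coef(kitchen_sink))
```

Printing explicit_formula shows the expanded right-hand side, which is a quick way to see exactly how many predictors the kitchen sink model has taken on.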
