The Multiple Regression Model
There are several important issues involved in carrying out a multiple regression:
- which explanatory variables to include;
- curvature in the response to the explanatory variables;
- interactions between explanatory variables;
- correlation between explanatory variables;
- the risk of overparameterization.
The assumptions about the response variable are the same as with simple linear regression: the errors are normally distributed, the errors are confined to the response variable, and the variance is constant. The explanatory variables are assumed to be measured without error. The model for a multiple regression with two explanatory variables (x1 and x2) looks like this:
$$y_i = \beta_0 + \beta_1 x_{1,i} + \beta_2 x_{2,i} + \varepsilon_i$$
The ith data point, yi, is determined by the levels of the two continuous explanatory variables x1i and x2i, by the model's three parameters (the intercept β0 and the two slopes β1 and β2), and by the residual εi of point i from the fitted surface. More generally, the model is presented like this:
$$y_i = \sum_{j} \beta_j x_{j,i} + \varepsilon_i$$
where the summation term is called the linear predictor and can involve many explanatory variables, non-linear terms and interactions.
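To make the linear predictor concrete, here is a minimal sketch in Python (standard library only) that fits the two-variable model by ordinary least squares, solving the normal equations directly. The data values and the function names (solve_linear_system, fit_least_squares, linear_predictor) are invented for illustration and are not taken from the book; in R, a call to lm would normally do this work.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so that row operations act on both sides at once.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: explanatory variables are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_least_squares(rows, y):
    """Return the coefficients that minimise the sum of squared residuals.

    Each entry of rows holds the terms of the linear predictor for one
    data point, for example [1, x1, x2] for an intercept and two slopes.
    """
    p = len(rows[0])
    xtx = [[sum(r[j] * r[k] for r in rows) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * yi for r, yi in zip(rows, y)) for j in range(p)]
    return solve_linear_system(xtx, xty)


def linear_predictor(coefs, row):
    """Evaluate the summation term: the sum of coefficient times term."""
    return sum(b * x for b, x in zip(coefs, row))


# Made-up data: a known surface plus small made-up errors (assumption).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
errors = [0.1, -0.2, 0.05, 0.15, -0.1, 0.0]
y = [1.0 + 2.0 * a - 0.5 * b + e for a, b, e in zip(x1, x2, errors)]

# The leading 1.0 in each row carries the intercept.
design = [[1.0, a, b] for a, b in zip(x1, x2)]
beta = fit_least_squares(design, y)
residuals = [yi - linear_predictor(beta, row) for yi, row in zip(y, design)]

print("estimated intercept and slopes:", beta)
print("residuals:", residuals)
```

Because each row of design is just a list of terms, appending a column such as a * b or a ** 2 adds an interaction or a curvature term to the linear predictor while the model stays linear in its parameters.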
Example
Let's begin with an example from air pollution studies. ...