There are several important issues involved in carrying out a multiple regression:
The assumptions about the response variable are the same as with simple linear regression: the errors are normally distributed, the errors are confined to the response variable, and the variance is constant. The explanatory variables are assumed to be measured without error. The model for a multiple regression with two explanatory variables (x1 and x2) looks like this:

$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \varepsilon_i$$
The ith data point, yi, is determined by the levels of the two continuous explanatory variables x1i and x2i, by the model's three parameters (the intercept β0 and the two slopes β1 and β2), and by the residual εi of point i from the fitted surface. More generally, the model is presented like this:

$$y_i = \sum_j \beta_j x_{ji} + \varepsilon_i$$
where the summation term is called the linear predictor and can involve many explanatory variables, non-linear terms and interactions.
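To make the roles of the intercept, the slopes, the linear predictor and the residuals concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the method used in the text: the data are hypothetical and noise-free (generated from y = 1 + 2 x1 - 3 x2), the intercept is handled by adding a constant column of ones, and the estimates come from solving the normal equations directly. The function names are invented for this example.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left unchanged.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: explanatory variables may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_least_squares(rows, y):
    """Least-squares estimates from the normal equations (X'X) beta = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[j] * r[k] for r in rows) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * yi for r, yi in zip(rows, y)) for j in range(p)]
    return solve_linear_system(xtx, xty)


def linear_predictor(row, beta):
    """The summation term: sum over j of beta_j * x_j for one data point."""
    return sum(b * x for b, x in zip(beta, row))


# Hypothetical, noise-free data generated from y = 1 + 2*x1 - 3*x2.
x1 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.0, 0.0, 2.0, 1.0, 3.0, 2.0]
y = [1.0 + 2.0 * a - 3.0 * b for a, b in zip(x1, x2)]

# Each row is (1, x1, x2); the leading 1 carries the intercept beta0.
rows = [[1.0, a, b] for a, b in zip(x1, x2)]
beta = fit_least_squares(rows, y)
residuals = [yi - linear_predictor(r, beta) for r, yi in zip(rows, y)]

# An interaction is just one more column (x1 * x2) in the linear predictor.
rows_int = [[1.0, a, b, a * b] for a, b in zip(x1, x2)]
beta_int = fit_least_squares(rows_int, y)

print("intercept and slopes:", beta)
print("with interaction term:", beta_int)
```

Because the hypothetical data contain no noise, the fitted parameters should reproduce the generating values to within rounding error, the residuals should be essentially zero, and the interaction coefficient should be close to zero. With real data the residuals would not vanish, and solving the normal equations by hand like this is less numerically robust than the QR-based routines that statistical software typically uses.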
Let's begin with an example from air pollution studies. ...