## The Multiple Regression Model

There are several important issues involved in carrying out a multiple regression:

- which explanatory variables to include;
- curvature in the response to the explanatory variables;
- interactions between explanatory variables;
- correlation between explanatory variables;
- the risk of overparameterization.

The assumptions about the response variable are the same as with simple linear regression: the errors are normally distributed, the errors are confined to the response variable, and the variance is constant. The explanatory variables are assumed to be measured without error. The model for a multiple regression with two explanatory variables (*x*_{1} and *x*_{2}) looks like this:

$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \varepsilon_i$$

The *i*th data point, *y*_{i}, is determined by the levels of the two continuous explanatory variables *x*_{1i} and *x*_{2i}, by the model's three parameters (the intercept β_{0} and the two slopes β_{1} and β_{2}), and by the residual ε_{i} of point *i* from the fitted surface. More generally, the model is presented like this:

$$y_i = \sum_{j=0}^{k} \beta_j x_{ji} + \varepsilon_i$$

where *x*_{0i} = 1 for every point (so that β_{0} is the intercept) and the summation term is called the **linear predictor**. The linear predictor can involve many explanatory variables, non-linear terms and interactions.
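To make the two-variable model concrete, here is a minimal Python sketch that simulates data from it and recovers the three parameters by ordinary least squares. Everything in it is an assumption made for illustration: the data are synthetic, and the parameter values, sample size, error standard deviation and variable names are arbitrary choices, not taken from the example that follows.

```python
import numpy as np

# Synthetic data, for illustration only: all values below are assumed.
rng = np.random.default_rng(42)
n = 200
x1 = rng.uniform(0, 10, n)               # first explanatory variable
x2 = rng.uniform(0, 5, n)                # second explanatory variable
beta_true = np.array([2.0, 1.5, -0.8])   # intercept, slope for x1, slope for x2
eps = rng.normal(0, 0.5, n)              # normal errors with constant variance

# The model: y_i = beta_0 + beta_1 * x_1i + beta_2 * x_2i + eps_i
y = beta_true[0] + beta_true[1] * x1 + beta_true[2] * x2 + eps

# Design matrix: a leading column of ones carries the intercept,
# so the linear predictor is the matrix product X @ beta.
X = np.column_stack([np.ones(n), x1, x2])

# Ordinary least squares estimate of the three parameters.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

fitted = X @ beta_hat       # points on the fitted surface
residuals = y - fitted      # vertical departures from that surface

print("estimated parameters:", np.round(beta_hat, 3))
print("residual standard deviation:", round(float(residuals.std(ddof=3)), 3))
```

The column of ones in `X` plays the role of *x*_{0i} = 1, which is why the intercept can be treated as just another term in the linear predictor. Adding a curvature or interaction term would amount to adding a further column (for example `x1**2` or `x1 * x2`) to the design matrix.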

### Example

Let's begin with an example from air pollution studies. ...