One of the commonest, and simplest, uses of statistical analysis is the fitting of a straight line, known for historical reasons as a *regression line*, to describe the relationship between an *explanatory variable*, *X*, and a *response variable*, *Y*. The departure of the values of *Y* from this line is called the *residual variation*, and is regarded as random. It is natural to ask whether the part of the variation in *Y* that is explained by the relationship with *X* is more than could reasonably be expected by chance or, more formally, whether it is *significant* relative to the residual variation. This is a simple *regression analysis*, and for many data sets it is all that is required. However, in some cases, several observations of *Y* are taken at each value of *X*. The data then form natural groups, and it may no longer be appropriate to analyse them as though every observation were independent: observations of *Y* at the same value of *X* may lie at a similar distance from the line. We may then be able to recognize two sources of random variation, namely

- variation among groups
- variation among observations within each group.
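The distinction between these two sources can be illustrated with a small numerical sketch. The code below is not taken from the text: the function `decompose`, the simulated data and all parameter values are hypothetical, and equal group sizes are assumed. It fits a least-squares line, then splits the residual variation into the part due to group means departing from the line and the part due to observations departing from their own group mean.

```python
import random

def decompose(x_levels, groups):
    """Partition the variation in Y when several Y values share each X.

    x_levels: one X value per group; groups: list of lists of Y values.
    (Hypothetical helper for illustration only.)
    """
    xs = [x for x, ys in zip(x_levels, groups) for _ in ys]
    ys_all = [y for ys in groups for y in ys]
    n = len(ys_all)
    x_bar = sum(xs) / n
    y_bar = sum(ys_all) / n

    # Ordinary least-squares slope and intercept.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys_all))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    ss_total = sum((y - y_bar) ** 2 for y in ys_all)
    ss_reg = slope * sxy  # variation explained by the line
    ss_among = 0.0        # group means about the fitted line
    ss_within = 0.0       # observations about their own group mean
    for x, ys in zip(x_levels, groups):
        fitted = intercept + slope * x
        mean_g = sum(ys) / len(ys)
        ss_among += len(ys) * (mean_g - fitted) ** 2
        ss_within += sum((y - mean_g) ** 2 for y in ys)

    return {
        "slope": slope, "intercept": intercept,
        "ss_total": ss_total, "ss_reg": ss_reg,
        "ss_among": ss_among, "ss_within": ss_within,
        "df_among": len(groups) - 2, "df_within": n - len(groups),
    }

# Simulated data (assumed values): six X levels, four observations each.
random.seed(1)
x_levels = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
groups = []
for x in x_levels:
    group_effect = random.gauss(0.0, 1.0)  # shared by every observation in the group
    groups.append([2.0 + 0.5 * x + group_effect + random.gauss(0.0, 0.3)
                   for _ in range(4)])

result = decompose(x_levels, groups)
n_obs = sum(len(ys) for ys in groups)
ms_among = result["ss_among"] / result["df_among"]
ms_within = result["ss_within"] / result["df_within"]
ms_resid = (result["ss_among"] + result["ss_within"]) / (n_obs - 2)

# Naive ratio: treats all observations as independent.
f_naive = result["ss_reg"] / ms_resid
# Ratio against the among-group mean square: the groups are the units.
f_grouped = result["ss_reg"] / ms_among

print(f"MS among groups  = {ms_among:.3f}")
print(f"MS within groups = {ms_within:.3f}")
print(f"F (naive, 1 and {n_obs - 2} df)   = {f_naive:.2f}")
print(f"F (grouped, 1 and {result['df_among']} df) = {f_grouped:.2f}")
```

When the among-group mean square is much larger than the within-group mean square, as the simulation is set up to produce, the naive ratio tends to overstate the evidence for the relationship, because it counts every observation as an independent piece of information.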

This is one of the simplest situations in which it is necessary to consider the possibility that there may be more than a single *stratum* of random variation—or, in the language of mixed modelling, that a model ...
