Chapter 7

Linear Models

7.1 Introduction

When the normal distribution was introduced in Section 2.7, and particularly when the central limit theorem was presented in Section 4.3, we pointed out its importance in statistics. The central limit theorem says, roughly, that any quantity that can be seen as a sum of a large number of independent random contributions is at least approximately normally distributed, and we argued that many quantities studied in practice are of this kind.

We have already looked at IQ as an example, but this is a somewhat artificial measure that is specifically constructed to be normally distributed. For another, more relevant example, consider the height of a randomly chosen individual. We can easily come up with dozens of factors that affect a person's height, such as the parents' heights, intake of various nutrients during childhood, exercise, sleep habits, whether the mother smoked or consumed alcohol during pregnancy, access to proper medical care, and so on.
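The central limit argument above can be checked with a small simulation. The following is only a sketch under stated assumptions, not a model of real heights: each simulated individual is the sum of 50 independent contributions, each drawn uniformly from [-1, 1]. The number of factors, the uniform distribution, and the function names `simulate_sums` and `fraction_within` are all illustrative choices that do not come from the text. If the sums are approximately normal, roughly 68% of them should lie within one standard deviation of the mean and roughly 95% within two.

```python
import random
import statistics


def simulate_sums(n_factors=50, n_individuals=20000, seed=0):
    """Return one sum of n_factors independent uniform(-1, 1) draws per individual.

    The uniform distribution and the parameter values are illustrative
    assumptions; no single contribution is itself normally distributed.
    """
    rng = random.Random(seed)
    return [
        sum(rng.uniform(-1.0, 1.0) for _ in range(n_factors))
        for _ in range(n_individuals)
    ]


def fraction_within(values, k):
    """Fraction of values lying within k standard deviations of their mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(abs(v - mean) <= k * sd for v in values) / len(values)


sums = simulate_sums()
within_one = fraction_within(sums, 1)  # a normal distribution gives about 0.683
within_two = fraction_within(sums, 2)  # a normal distribution gives about 0.954

print(f"within 1 sd: {within_one:.3f}")
print(f"within 2 sd: {within_two:.3f}")
```

Replacing the uniform draws with another distribution that has finite variance should give similar fractions, which is the point of the argument: the approximate normality comes from the summation, not from the shape of the individual contributions.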

Therefore, it is quite common to assume that random samples come from normal distributions with unknown mean and variance, and in this chapter we present a number of inference methods developed specifically for this case. Models that include normally distributed variation are usually called linear models, a term that will hopefully become clearer as we go along.

7.2 Sampling Distributions

Before going into the various inference ...
