Averages are mentioned every day in the press and other popular media. Occasionally, an average is accompanied by some information about the frequency distribution from which it was calculated, but usually only the average is reported.
Here is a typical example, quoted from a newspaper: ‘It takes an average of 17 months and 26 days to get over a divorce, according to a survey released yesterday. That’s the time it takes to resolve contentious issues, such as child custody, property problems and money worries.’ Even supposing that this survey was done in a way that permits valid generalisation to the whole community, and that we knew how this average was calculated, the average alone gives an incomplete and minimally informative picture of the time it takes people, in general, to get over a divorce.
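A small sketch may make the point concrete. The two samples below are invented purely for illustration (they are not the survey's data, which the newspaper did not report), and the names `sample_a` and `sample_b` are hypothetical. Both have the same average recovery time, yet they describe very different experiences.

```python
from statistics import mean, stdev

# Hypothetical recovery times in months. These values are invented for
# illustration only; they are NOT taken from the survey quoted above.
sample_a = [16, 17, 18, 18, 19, 20]   # everyone recovers in about a year and a half
sample_b = [2, 5, 9, 18, 30, 44]      # recovery ranges from weeks to several years

for name, sample in (("A", sample_a), ("B", sample_b)):
    # Report the sample size and the sample standard deviation
    # alongside the mean, rather than the mean alone.
    print(
        f"sample {name}: n={len(sample)}, "
        f"mean={mean(sample):.1f} months, "
        f"sd={stdev(sample):.1f} months"
    )
```

Both samples print the same mean, but the standard deviation of the second is more than ten times that of the first. A report that gave only the average could not distinguish between them.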
To get something more useful from the survey results, we need to know, at the very least, the number of respondents to the survey and some measure of the spread of the sample values of the variable being studied. While the number of respondents is sometimes mentioned in media reports, a measure of the spread of the data is hardly ever given.
Why might this be? And what could be done to remedy the situation?
Let us start by reviewing how beginning students of statistics build their knowledge of the subject, and note some interesting sidelights along the way. ...