3.12 So What Do We Do with All This Stuff?

How are the various descriptive measures developed in this chapter to be used in analyzing a data set? To answer this question, we need only remember that the characteristics of interest of a data set are shape, central location (“center” for short), and dispersion (“spread” for short). In this regard:

1. If a distribution is symmetric (or nearly so) with no outliers, then the sample mean X̄ and the sample standard deviation s provide us with legitimate descriptions of center and spread, respectively. (Both conditions matter. The values of X̄ and s are sensitive to inordinately large or small extreme values of a variable X, so outliers distort them. And if the distribution lacks symmetry, no single number can adequately describe spread, since the two sides of a highly skewed distribution have different spreads.)
2. If a distribution is skewed and/or exhibits outliers, then X̄ and s are inadequate for describing center and spread, respectively. In this instance, a boxplot should be used to obtain a summary of both center and spread. To construct a boxplot, we need as inputs the five-number summary:

min Xi, Q1, median, Q3, max Xi

along with ...
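As an illustration of the two cases above, the following minimal Python sketch computes X̄, s, and the five-number summary for a small data set. The function names `five_number_summary` and `describe` and the sample data are hypothetical. The quartiles use the "inclusive" method of the standard library's `statistics.quantiles`, which is only one of several conventions and may differ slightly from the one used in this chapter. The 1.5 × IQR rule for flagging outliers is likewise an assumption made for the sketch, not something stated in the text above.

```python
import statistics


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) for a sequence of at least two numbers."""
    ordered = sorted(data)
    # Quartile conventions vary between textbooks; "inclusive" is one common choice.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


def describe(data):
    """Collect the mean, standard deviation, five-number summary, and flagged outliers."""
    minimum, q1, median, q3, maximum = five_number_summary(data)
    iqr = q3 - q1
    # Assumed outlier rule: values more than 1.5 * IQR beyond the quartiles.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return {
        "mean": statistics.mean(data),
        "s": statistics.stdev(data),  # sample standard deviation (n - 1 divisor)
        "five_number_summary": (minimum, q1, median, q3, maximum),
        "outliers": outliers,
    }


# Hypothetical data with one inordinately large value.
sample = [2, 4, 4, 5, 6, 7, 8, 9, 40]
result = describe(sample)
print(result["five_number_summary"])
print(result["outliers"])
```

With this sample, the single large value pulls the mean well above the median, which is the situation in which item 2 recommends the five-number summary over X̄ and s.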
