Chapter 6. Mean and Median

“...like a statistician who drowned in a lake of average depth six inches.”

Anonymous

Try to think of a time when you listened to a presentation about data that didn’t include either an average or a median value. They’re almost as common as percentages. Whether we’re tracking home prices, the stock market, student test scores, or the price of gasoline, we come face to face with the notion of central tendency on a regular basis.

Why are these measures so commonly used? As humans, we have a hard time processing a simple list of more than half a dozen values, let alone reams of raw data. The appeal of measures of central tendency is that they condense a lot of data into digestible morsels that carry the notion of “typical.”

As useful as these statistics can be for communicating data, they need to be handled with care. In this chapter, we’ll see how they can be put to good use, but we’ll also see how they can mislead.

The three main measures of central tendency are mean, median, and mode. Let’s start with their definitions:

  • The mean (or average) is determined by summing all of the values in a data set and dividing by the number of values. The mean is considered a “representative value,” meaning if you replaced each value in the data set with the mean, the overall sum wouldn’t change.

  • The median is the middle value in a data set in which the values have been placed in order of magnitude. Thus, half the values in the data set are less than the ...
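
The two definitions above can be checked with a few lines of code. The following is a minimal sketch in Python, used here only as a neutral illustration and not as the chapter's own tooling; the numbers in `values` are made up for the example.

```python
from statistics import mean, median

# Hypothetical values, chosen only for illustration.
values = [13, 2, 7, 3, 5]

# Mean: the sum of all values divided by the number of values.
avg = mean(values)

# Median: the middle value once the values are sorted by magnitude.
mid = median(values)

# "Representative value" property: replacing every value with the mean
# leaves the overall sum unchanged.
replaced_total = avg * len(values)

print("mean:", avg)
print("median:", mid)
print("original sum:", sum(values), "sum after replacement:", replaced_total)
```

Sorted, the values are 2, 3, 5, 7, 13, so the median is the third of the five values, while the mean is the total of 30 divided by 5.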
