Plot Histograms in Excel
Use Microsoft Excel to plot data distributions so that you can have a better understanding of statistics.
There is some truth to the cliché “a picture is worth a thousand words.” A picture is often the best way to understand 1,000 numbers. People are visually oriented. We’re good at looking at a picture and observing different characteristics; we’re bad at looking at a list of 1,000 numbers.
One of the most powerful tools available for understanding data is the histogram, a picture of the distribution of values. The idea is simple. Suppose you have a lot of data: say, the batting averages for all 6,032 baseball players between 1955 and 2004 who averaged 3.1 or more plate appearances per game. Suppose, too, that you want to know how these values are distributed. What are the lowest and highest values? Are there more low values than high values? Were batting averages totally random numbers between 0 and .400, or was there some pattern?
Batting average can take many different values. Between 1955 and 2004, 6,032 players had qualifying batting averages, and there were 1,229 unique values for batting average. You can plot the number of players with each unique batting average (though I can’t imagine what this graph would look like). But we don’t really care about each unique value; for example, the fact that 13 players had a batting average of .2862 is not that interesting. Instead, we might want to know the number of players with very similar batting ...
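To make the grouping idea concrete, here is a minimal sketch in Python rather than Excel. It is only an illustration: the function name `bin_counts`, the bin width of .010, and the eight sample averages are all assumptions made up for this example, not the real 1955-2004 data. Each average is assigned to the bin whose lower bound is the nearest multiple of the bin width at or below it, and the function counts how many averages land in each bin.

```python
from collections import Counter


def bin_counts(averages, bin_width_points=10):
    """Count how many batting averages fall into each bin.

    bin_width_points is the bin width in "points" (thousandths), so the
    default of 10 means bins that are .010 wide. The result maps each
    bin's lower bound, in points, to the number of averages in that bin.
    """
    counts = Counter()
    for avg in averages:
        # Work in whole ten-thousandths to avoid floating-point surprises.
        ten_thousandths = round(avg * 10000)
        width = bin_width_points * 10
        lower_points = (ten_thousandths // width) * width // 10
        counts[lower_points] += 1
    return dict(sorted(counts.items()))


# Hypothetical sample values, invented for illustration only.
sample = [0.2862, 0.2810, 0.2795, 0.3012, 0.2433, 0.2899, 0.3050, 0.2601]

# Print a crude text histogram: one asterisk per player in the bin.
for lower, count in bin_counts(sample).items():
    print(f".{lower:03d}-.{lower + 9:03d}  {'*' * count}")
```

With the default width, an average of .2862 lands in the .280-.289 bin along with every other average in that range, which is the kind of grouping a histogram plots.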