There are three kinds of lies: lies, damned lies, and statistics.

—Benjamin Disraeli (1804-1881)

Statistics is the science of quantifying conjectures. How likely is an event? How much does it depend on other events? Was an event due to chance, or is it attributable to another cause? And for whatever answers you might have for these questions, how confident are you that they’re correct?

Statistics is not the same as probability, but the two are deeply intertwined and on occasion blend together. The proper distinction between them is this: probability is a mathematical discipline, and probability problems have unique, correct solutions. Statistics is concerned with the application of probability theory to particular real-world phenomena.

A more colloquial distinction is that probability deals with small
amounts of data, and statistics deals with large amounts. As you saw
in the last chapter, probability uses random numbers and random
variables to represent individual events. Statistics is about
*situations*: given poll results, or medical
studies, or web hits, what can you infer? Probability began with the
study of gambling; statistics has a more sober heritage. It arose
primarily because of the need to estimate population, trade, and
unemployment.

In this chapter, we’ll begin with some simple statistical measures:
mean, median, mode, variance, and standard deviation. Then we’ll
explore *significance tests*, which tell you how sure you can be that some phenomenon ...
