Chapter 2. Foundations of Probability

Have you ever stopped to consider what your meteorologist really means by a 30% chance of rain? Barring a crystal ball, they can’t say for sure whether it will rain. That is, they are uncertain about an outcome. What they can do is quantify that uncertainty as a value between 0% (certain it will not rain) and 100% (certain it will rain).

Data analysts, like meteorologists, do not possess crystal balls. Often, we want to make claims about an entire population while only possessing the data for a sample. So we too will need to quantify uncertainty as a probability.

We’ll start this chapter by digging deeper into how probability works and how probabilities are derived. We’ll also use Excel to simulate some of the most important theorems in statistics, which are largely based on probability. This will put you on excellent footing for Chapter 3 and Chapter 4, where we’ll perform inferential statistics in Excel.

Probability and Randomness

Colloquially, we say that something is “random” when it seems out of context or haphazard. In probability, something is random when we know an event will have an outcome, but we’re not sure what that outcome will be.

Take a six-sided die, for example. When we toss the die, we know it will land on one side—it won’t disappear or land on multiple sides. Knowing that we’ll get an outcome, but not which outcome, is what’s meant by randomness in statistics.
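
To make the die example concrete, here is a minimal sketch in Python rather than the Excel used elsewhere in this chapter. The function name `roll_die`, the seed value, and the number of rolls are all assumptions chosen for illustration, and the die is assumed to be fair.

```python
import random
from collections import Counter


def roll_die(rng: random.Random, sides: int = 6) -> int:
    """Return one outcome of a die: an integer from 1 to `sides`, each equally likely."""
    return rng.randint(1, sides)


# A seeded generator makes the run repeatable; the seed itself is arbitrary.
rng = random.Random(42)

# Toss the die many times. Every toss yields exactly one face,
# but we cannot say in advance which face it will be.
rolls = [roll_die(rng) for _ in range(6000)]
counts = Counter(rolls)

# Print how often each face came up, as a count and as a share of all rolls.
for face in range(1, 7):
    print(face, counts[face], round(counts[face] / len(rolls), 3))
```

Each roll is guaranteed to land on exactly one of the six faces, which is the "we know there will be an outcome" half of randomness. Which face appears on any given roll is the part we can't know ahead of time.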

Probability and Sample Space

We know that when the die lands, ...
