Chapter 2. Computational Statistics
Distributions
In statistics, a distribution is a set of values and their corresponding probabilities.
For example, if you roll a six-sided die, the set of possible values is the numbers 1 to 6, and the probability associated with each value is 1/6.
As another example, you might be interested in how many times each word appears in common English usage. You could build a distribution that includes each word and how many times it appears.
To represent a distribution in Python, you could use a dictionary that maps from each value to its probability. I have written a class called Pmf that uses a Python dictionary in exactly that way, and provides a number of useful methods. I called the class Pmf in reference to a probability mass function, which is a way to represent a distribution mathematically.
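As a minimal sketch of the dictionary idea, here is the die distribution written with a plain Python dict rather than the Pmf class. The names die, p_three, and total are chosen for illustration and are not part of thinkbayes.py.

```python
# Map each face of a six-sided die to its probability.
# `die` is an illustrative name, not something defined in thinkbayes.py.
die = {face: 1 / 6.0 for face in range(1, 7)}

# Look up the probability associated with one value.
p_three = die[3]

# For a complete distribution, the probabilities should add up to 1
# (up to floating-point rounding).
total = sum(die.values())
```

Writing 1 / 6.0 rather than 1 / 6 keeps the division in floating point under both Python 2 and Python 3.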
Pmf is defined in a Python module I wrote to accompany this book, thinkbayes.py. You can download it from http://thinkbayes.com/thinkbayes.py. For more information see “Working with the code”.
To use Pmf you can import it like this:
from thinkbayes import Pmf
The following code builds a Pmf to represent the distribution of outcomes for a six-sided die:
pmf = Pmf()
for x in [1,2,3,4,5,6]:
    pmf.Set(x, 1/6.0)
Pmf() creates an empty Pmf with no values. The Set method sets the probability associated with each value to 1/6.
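To make the role of Set concrete without downloading thinkbayes.py, the sketch below uses a hypothetical stand-in class, TinyPmf, that wraps a dictionary. It is not the Pmf class from the book's module. The attribute name d and the class itself are assumptions made for illustration, and the real Pmf provides more methods than this.

```python
class TinyPmf:
    """Hypothetical stand-in for Pmf: a thin wrapper around a dict."""

    def __init__(self):
        # Start empty, with no values.
        self.d = {}

    def Set(self, x, p):
        # Associate probability p with value x, replacing any earlier entry.
        self.d[x] = p


pmf = TinyPmf()
for x in [1, 2, 3, 4, 5, 6]:
    pmf.Set(x, 1 / 6.0)

# Each of the six values now maps to 1/6, so the total is 1
# (up to floating-point rounding).
total = sum(pmf.d.values())
```

Because Set overwrites any earlier entry, calling it twice for the same value leaves only the most recent probability.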
Here’s another example that counts the number of times each ...