Chapter 5. Modeling Distributions
The distributions we have used so far are called empirical distributions because they are based on empirical observations, which are necessarily finite samples.
The alternative is an analytic distribution, which is characterized by a CDF that is a mathematical function. Analytic distributions can be used to model empirical distributions. In this context, a model is a simplification that leaves out unneeded details. This chapter presents common analytic distributions and uses them to model data from a variety of sources.
The code for this chapter is in analytic.py. For information about downloading and working with this code, see Using the Code.
The Exponential Distribution
I’ll start with the exponential distribution because it is relatively simple. The CDF of the exponential distribution is
$$\mathrm{CDF}(x) = 1 - e^{-\lambda x}$$
The parameter, λ, determines the shape of the distribution. Figure 5-1 shows what this CDF looks like with λ = 0.5, 1, and 2.
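To make the role of λ concrete, here is a minimal sketch that evaluates the exponential CDF, 1 − e^(−λx), at a few points for the same three values of λ. The function name exponential_cdf is chosen for illustration and is not taken from analytic.py, which may organize this differently.

```python
import math

def exponential_cdf(x, lam):
    """Evaluate the exponential CDF, 1 - exp(-lam * x), at x.

    Hypothetical helper for illustration; lam is the rate parameter.
    """
    if x < 0:
        return 0.0  # the distribution has no mass below zero
    return 1.0 - math.exp(-lam * x)

# Print the CDF at x = 0, 1, 2 for each value of lambda.
for lam in (0.5, 1, 2):
    values = [round(exponential_cdf(x, lam), 3) for x in (0, 1, 2)]
    print(f"lambda={lam}: {values}")
```

Larger values of λ make the CDF rise toward 1 more quickly, which means that small values of x account for more of the probability.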
In the real world, exponential distributions come up when we look at a series of events and measure the times between events, called interarrival times. If the events are equally likely to occur at any time, the distribution of interarrival times tends to look like an exponential distribution.
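One way to check this claim is with a small simulation. The sketch below scatters events uniformly at random over a time interval, computes the gaps between consecutive events, and compares the empirical CDF of the gaps with an exponential CDF whose λ is estimated as one over the mean gap. All names, the seed, and the sample sizes are assumptions made for illustration; they are not from analytic.py.

```python
import math
import random

def simulate_interarrival_times(n_events, duration, seed=17):
    """Place n_events uniformly at random in [0, duration]
    and return the gaps between consecutive events."""
    rng = random.Random(seed)
    times = sorted(rng.uniform(0, duration) for _ in range(n_events))
    return [later - earlier for earlier, later in zip(times, times[1:])]

def empirical_cdf(sample, x):
    """Fraction of values in sample that are less than or equal to x."""
    return sum(1 for v in sample if v <= x) / len(sample)

gaps = simulate_interarrival_times(n_events=10000, duration=5000.0)

# Estimate the rate parameter as the reciprocal of the mean gap.
lam = 1.0 / (sum(gaps) / len(gaps))

# Compare the empirical CDF with the exponential model at a few points.
for x in (0.25, 0.5, 1.0, 2.0):
    model = 1.0 - math.exp(-lam * x)
    print(f"x={x}: empirical={empirical_cdf(gaps, x):.3f}, model={model:.3f}")
```

If the two columns agree closely, the exponential distribution is a reasonable model for these simulated interarrival times.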