3 Random Variables, Distributions, Moments, and Statistics
3.1 RANDOM VARIABLES
We can dene a random variable (RV) once we have dened the sample space, based on the possible
events and their probabilities. An RV is a rule, or a function, or a map associating a number to each
event in the sample space (see Figure 3.1).
Example: In the roll of a six-sided die, the events are $A_i$ = “side facing up is $i$”, where $i$ = 1, 2, …, 6,
with P[$A_i$] = 1/6. We can make a “discrete” RV, denoted by X, associating each event with
the value of the RV; for example, X taking values in {1, 2, 3, 4, 5, 6}, each with P[X = $x_i$] = 1/6.
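The die example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the event labels `A_1` … `A_6`, the dictionary `event_prob`, and the helper `prob_X_equals` are hypothetical names chosen here, not notation from the text.

```python
from fractions import Fraction

# Sample space: one event per face of the die (A_1 ... A_6), each with probability 1/6.
# The string labels are a hypothetical encoding of the events.
event_prob = {f"A_{i}": Fraction(1, 6) for i in range(1, 7)}


def X(event: str) -> int:
    """Random variable: map the event 'A_i' to the number i."""
    return int(event.split("_")[1])


def prob_X_equals(x: int) -> Fraction:
    """P[X = x]: sum the probabilities of all events that X maps to x."""
    return sum((p for e, p in event_prob.items() if X(e) == x), Fraction(0))


# The set of values the RV can take, and the probability of one of them.
values = sorted({X(e) for e in event_prob})
print(values)
print(prob_X_equals(3))
```

Writing X as an ordinary function makes the point of the definition concrete: the RV is the map from events to numbers, and the probabilities of its values are inherited from the probabilities of the events.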
We refer to the type of RV described in the example as discrete because its values are discrete, that
is, a set of numbers, in this case the integers 1, 2, …, 6. Alternatively, the events can be defined from
intervals contained in a range of real values from a to b. In this case, the values of RV X are continuous
in this range, and we call this type of RV continuous.
Example: We measure the concentration of a mineral (in ppm) at a given location, and it can take
values between 0 and 10,000 ppm. The continuous random variable X is the concentration. An event
could be defined as A = “measured concentration is in the interval 10–15 ppm”.
3.2 DISTRIBUTIONS
RV distributions are dened in the following manner. Here we assume that X is an RV.
3.2.1 probability Mass anD Density functions (pmf and pdf)
A discrete distribution or probability “mass” function (pmf) p(X) is a set of probabilities, one for
each value of X. More precisely, denoting by $x_i$ the values of X,

$$p(x_i) = P[X = x_i] \quad \text{for all values } x_i \text{ of } X \tag{3.1}$$

$$0 \le p(x_i) \le 1 \quad \text{for all } i \tag{3.2}$$

$$\sum_i p(x_i) = 1 \tag{3.3}$$
60 Data Analysis and Statistics for Geography, Environmental Science, and Engineering
The last equation says that the total probability (sample space) must be equal to 1 when all
probabilities are summed over all i.
Example: Toss a coin. Assign 0 to T and 1 to H. p(0) = 0.5, p(1) = 0.5. This is an example of a
uniform discrete RV: probabilities of each event are the same.
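As a minimal sketch, the two conditions on a pmf (Equations 3.2 and 3.3) can be checked numerically for the coin example. The function name `is_valid_pmf`, the tolerance `tol`, and the counterexample `not_a_pmf` are assumptions introduced here for illustration.

```python
import math


def is_valid_pmf(pmf: dict, tol: float = 1e-12) -> bool:
    """Check that every p(x_i) lies in [0, 1] and that the probabilities sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in pmf.values())
    sums_to_one = math.isclose(sum(pmf.values()), 1.0, abs_tol=tol)
    return in_range and sums_to_one


coin = {0: 0.5, 1: 0.5}        # 0 = tails (T), 1 = heads (H)
not_a_pmf = {0: 0.5, 1: 0.6}   # hypothetical counterexample: the sum is 1.1

print(is_valid_pmf(coin))
print(is_valid_pmf(not_a_pmf))
```

The comparison uses a small tolerance rather than exact equality because sums of floating-point probabilities are rarely exactly 1.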
We can represent the probabilities as a graph, as illustrated in Figure 3.2 for the preceding example.
Each value is drawn as a thick vertical arrow or bar, representing a spike or impulse whose height
equals the probability of that particular value. Alternatively, the pmf can be represented
as a bar graph where the height of each bar represents the probability (see Figure 3.2).
Example: Roll a six-sided die. Then the pmf is p($x_i$) = 1/6, where $x_i$ ∈ {1, 2, …, 6}. This is also a
uniform discrete RV (see Figure 3.3).
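To make the bar-graph representation concrete, here is a small sketch that prints the die pmf as a text bar graph. It is not how the book's figures were produced; `bar_graph` and its `width` scaling parameter are hypothetical names used only for this illustration.

```python
from fractions import Fraction

# pmf of the six-sided die: p(x_i) = 1/6 for x_i = 1, ..., 6.
die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}


def bar_graph(pmf: dict, width: int = 30) -> list:
    """Return one text line per value; the bar length is proportional to p(x)."""
    lines = []
    for x in sorted(pmf):
        bar = "#" * round(pmf[x] * width)
        lines.append(f"{x} | {bar} {pmf[x]}")
    return lines


for line in bar_graph(die_pmf):
    print(line)
```

Because the RV is uniform, all six bars come out the same length, which is the visual signature of a uniform discrete distribution.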
For more examples of illustrations of pmfs, see Davis, 2002, Chapter 2.
A continuous distribution or probability “density” function (pdf) p(X) is defined based on intervals:
p(x)dx is the probability of the value being in an “infinitesimal” interval of X between x and x + dx
(infinitesimal is a calculus concept that basically means “very, very small”), that is to say,

$$p(x)\,dx = P[x < X \le x + dx] \tag{3.4}$$
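The interval definition of a density can be explored numerically. The sketch below assumes, purely for illustration, an exponential density with a hypothetical rate `RATE = 1.0`. Neither this density nor the cumulative probability used to get the exact interval probability comes from the text at this point. It compares p(x)dx with the exact probability of the interval as dx shrinks.

```python
import math

RATE = 1.0  # hypothetical rate parameter of the assumed exponential density


def pdf(x: float) -> float:
    """Assumed density p(x) = RATE * exp(-RATE * x) for x >= 0."""
    return RATE * math.exp(-RATE * x)


def cdf(x: float) -> float:
    """P[X <= x] for the same assumed density."""
    return 1.0 - math.exp(-RATE * x)


def interval_prob(x: float, dx: float) -> float:
    """Exact P[x < X <= x + dx], obtained from the cumulative probability."""
    return cdf(x + dx) - cdf(x)


x = 1.0
for dx in (0.1, 0.01, 0.001):
    approx = pdf(x) * dx           # left-hand side of Equation 3.4
    exact = interval_prob(x, dx)   # right-hand side for a finite dx
    print(dx, approx, exact, exact / approx)
```

For a finite dx the two sides differ slightly, and the ratio approaches 1 as dx gets smaller, which is the sense in which the equality holds for an infinitesimal interval.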
FIGURE 3.3 pmf of a discrete RV for a six-sided die. Spike and bar graphs (horizontal axis: X = 1, …, 6; vertical axis: p(X), with each value at 1/6).
FIGURE 3.2 pmf of a discrete RV represented as a spike graph and as a bar graph (horizontal axis: X; vertical axis: p(X)).
FIGURE 3.1 Constructing a random variable. Panels: event space (U, containing events A and B); values of the random variable (X = 0, X = 1); probabilities (P(X = 0) = 0.3, P(X = 1) = 0.7).
