Chapter 3

Probability Distributions

This chapter begins with the basic notions of mathematical statistics that form the framework for analysis of financial data (see, e.g., [1–3]). In Section 3.2, a number of distributions widely used in statistical data analysis are listed. The stable distributions that have become popular in Econophysics research are discussed in Section 3.3.

3.1 BASIC DEFINITIONS

Consider the random variable (or variate) X. The probability density function P(x) defines the probability of finding X between a and b:

Pr(a \le X \le b) = \int_a^b P(x)\,dx    (3.1.1)

The probability density must be a non-negative function and must satisfy the normalization condition

\int_{X_{min}}^{X_{max}} P(x)\,dx = 1    (3.1.2)

where the interval [X_{min}, X_{max}] is the range of all possible values of X.

In fact, the infinite limits [-\infty, \infty] can always be used, since P(x) may be set to zero outside the interval [X_{min}, X_{max}]. As a rule, the infinite integration limits are omitted in what follows.

Another way of describing a random variable is to use the cumulative distribution function

Pr(X \le b) = \int_{-\infty}^{b} P(x)\,dx    (3.1.3)

Obviously, the probabilities satisfy the condition

Pr(X > b) = 1 - Pr(X \le b)    (3.1.4)
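As a quick numerical check of (3.1.1)–(3.1.4), the sketch below integrates a density by the trapezoidal rule. The choice of the standard normal density and of the integration bounds is illustrative, not from the text:

```python
import math

def normal_pdf(x, m=0.0, s=1.0):
    """Probability density of the normal distribution N(m, s^2)."""
    return math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))

def prob_between(a, b, pdf, n=10_000):
    """Pr(a <= X <= b) by trapezoidal integration of the density, as in (3.1.1)."""
    h = (b - a) / n
    total = 0.5 * (pdf(a) + pdf(b))
    total += sum(pdf(a + i * h) for i in range(1, n))
    return total * h

# Normalization (3.1.2): integrating the density over (effectively) all x gives 1.
print(round(prob_between(-10, 10, normal_pdf), 6))   # ~1.0

# Complement rule (3.1.4): Pr(X > b) = 1 - Pr(X <= b).
b = 0.5
print(round(1 - prob_between(-10, b, normal_pdf), 4))   # Pr(X > 0.5) ≈ 0.3085
```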

Two characteristics are used to describe the probable values of a random variable X: the mean (or expectation) and the median. The mean of X is the average of all possible values of X, weighted with the probability density P(x):

m \equiv E[X] = \int x P(x)\,dx    (3.1.5)

The median of X is the value M for which

Pr(X > M) = Pr(X < M) = 0.5    (3.1.6)

The median is the preferable characteristic of the most probable value for strongly skewed data samples. Consider a sample of lottery tickets that has one ‘‘lucky’’ ticket winning one million dollars and 999 ‘‘losers.’’ The mean win in this sample is $1000, which does not realistically describe the lottery outcome. The median value of zero is a much more relevant characteristic in this case.
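The lottery example above can be reproduced directly with the sample (discrete) versions of the mean and median:

```python
# The lottery sample from the text: one $1,000,000 winner among 1000 tickets.
wins = [1_000_000] + [0] * 999

mean = sum(wins) / len(wins)   # sample analogue of the mean (3.1.5)

def median(values):
    """Sample median: middle value of the sorted sample."""
    s = sorted(values)
    n = len(s)
    mid = n // 2
    return s[mid] if n % 2 else 0.5 * (s[mid - 1] + s[mid])

print(mean)           # 1000.0 -- dominated by the single outlier
print(median(wins))   # 0.0    -- the typical outcome
```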

The expectation of a random variable calculated using some available information I_t (that may change with time t) is called conditional expectation. The conditional probability density is denoted by P(x|I_t). Conditional expectation equals

E[X_t | I_t] = \int x P(x|I_t)\,dx    (3.1.7)
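The information set I_t in the text is abstract. As a concrete toy instance (my illustrative choice, not from the text), take the condition ‘‘X > 0’’ for a standard normal X; the conditional density is then P(x | X > 0) = P(x)/Pr(X > 0) for x > 0, and (3.1.7) can be evaluated numerically:

```python
import math

def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def cond_expectation(n=100_000, hi=10.0):
    """E[X | X > 0] for a standard normal X via (3.1.7), by trapezoidal integration."""
    h = hi / n
    norm = 0.0   # Pr(X > 0), which normalizes the conditional density
    mean = 0.0   # integral of x * P(x) over x > 0, computed in the same pass
    for i in range(n + 1):
        x = i * h
        w = h if 0 < i < n else 0.5 * h   # trapezoid weights
        norm += normal_pdf(x) * w
        mean += x * normal_pdf(x) * w
    return mean / norm

print(round(cond_expectation(), 4))   # ≈ 0.7979, i.e., sqrt(2/pi)
```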

The variance, Var, and the standard deviation, \sigma, are the conventional estimates of the deviations from the mean value of X:

Var[X] \equiv \sigma^2 = \int (x - m)^2 P(x)\,dx    (3.1.8)


In financial literature, the standard deviation of price is used to

characterize the price volatility.
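As a sample-based illustration of this usage (the return values below are made up for the example), the standard deviation of a series of price returns estimates the volatility:

```python
import math

def sample_stats(xs):
    """Sample mean and standard deviation: discrete analogues of (3.1.5) and (3.1.8)."""
    n = len(xs)
    m = sum(xs) / n
    var = sum((x - m) ** 2 for x in xs) / (n - 1)   # unbiased sample variance
    return m, math.sqrt(var)

# Hypothetical daily returns (fractions, not percent).
returns = [0.010, -0.004, 0.007, -0.012, 0.003, 0.008, -0.005]
m, s = sample_stats(returns)
print(f"mean return: {m:.4f}, volatility (std): {s:.4f}")
```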

The higher-order moments of the probability distributions are defined as

m_n \equiv E[X^n] = \int x^n P(x)\,dx    (3.1.9)

According to this definition, the mean is the first moment (m \equiv m_1), and the variance can be expressed via the first two moments, \sigma^2 = m_2 - m^2.

Two other important parameters, skewness S and kurtosis K, are related to the third and fourth moments, respectively:

S = E[(x - m)^3] / \sigma^3, \quad K = E[(x - m)^4] / \sigma^4    (3.1.10)

Both parameters, S and K, are dimensionless. Zero skewness implies that the distribution is symmetric around its mean value. Positive and negative values of skewness indicate a long positive tail and a long negative tail, respectively. Kurtosis characterizes the peakedness of the distribution. The kurtosis of the normal distribution equals three. The excess kurtosis, K_e = K - 3, is often used as a measure of deviation from the normal distribution. In particular, positive excess kurtosis (or leptokurtosis) indicates more frequent medium and large deviations from the mean value than is typical for the normal distribution. Leptokurtosis leads to a sharper central peak and to so-called fat tails in the distribution. Negative excess kurtosis indicates frequent small deviations from the mean value. In this case, the distribution flattens around its mean value while the distribution tails decay faster than the tails of the normal distribution.
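The moment formulas in (3.1.10) translate directly into sample estimates. The sketch below (samples are my illustrative choices) contrasts a symmetric, thin-tailed sample with one dominated by a single large outlier:

```python
import math

def skew_kurtosis(xs):
    """Sample skewness S and excess kurtosis K_e via the central moments, per (3.1.10)."""
    n = len(xs)
    m = sum(xs) / n
    mu2 = sum((x - m) ** 2 for x in xs) / n
    mu3 = sum((x - m) ** 3 for x in xs) / n
    mu4 = sum((x - m) ** 4 for x in xs) / n
    s = math.sqrt(mu2)
    return mu3 / s**3, mu4 / s**4 - 3.0   # (S, K_e = K - 3)

symmetric = [-2, -1, 0, 1, 2]        # symmetric, no tails to speak of
fat_tail = [0] * 99 + [10]           # one large positive outlier

S1, K1 = skew_kurtosis(symmetric)
S2, K2 = skew_kurtosis(fat_tail)
print(round(S1, 4), round(K1, 4))    # 0.0 -1.3 : symmetric, negative excess kurtosis
print(round(S2, 2), round(K2, 2))    # large positive: long right tail, leptokurtic
```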

The joint distribution of two random variables X and Y is the generalization of the cumulative distribution (see 3.1.3):

Pr(X \le b, Y \le c) = \int_{-\infty}^{b} \int_{-\infty}^{c} h(x, y)\,dx\,dy    (3.1.11)

In (3.1.11), h(x, y) is the joint density that satisfies the normalization

condition

\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(x, y)\,dx\,dy = 1    (3.1.12)
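A minimal numerical sketch of (3.1.11) and (3.1.12), assuming for illustration that X and Y are independent standard normals, so that the joint density factorizes as h(x, y) = P(x)P(y):

```python
import math

def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def joint_integral(b, c, n=400, lo=-8.0):
    """Double trapezoidal integral of h(x, y) = P(x)P(y) over (-inf, b] x (-inf, c],
    per (3.1.11); lo approximates -infinity."""
    hx, hy = (b - lo) / n, (c - lo) / n
    total = 0.0
    for i in range(n + 1):
        wi = hx if 0 < i < n else 0.5 * hx   # trapezoid weights in x
        px = normal_pdf(lo + i * hx)
        for j in range(n + 1):
            wj = hy if 0 < j < n else 0.5 * hy   # trapezoid weights in y
            total += wi * wj * px * normal_pdf(lo + j * hy)
    return total

# Normalization (3.1.12): integrating h over the whole plane gives 1.
print(round(joint_integral(8.0, 8.0), 6))   # ~1.0
# Pr(X <= 0, Y <= 0) = 0.25 for independent standard normals.
print(round(joint_integral(0.0, 0.0), 4))   # ~0.25
```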
