This chapter reviews notation and background material in mathematics, probability, and statistics. Readers may wish to skip this chapter and turn directly to Chapter 2, returning here only as needed.
1.1 Mathematical Notation
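We use boldface to distinguish a vector $\mathbf{x} = (x_1, \ldots, x_p)$ or a matrix $\mathbf{M}$ from a scalar variable $x$ or a constant $M$. A vector-valued function $\mathbf{f}$ evaluated at $\mathbf{x}$ is also boldfaced, as in $\mathbf{f}(\mathbf{x}) = (f_1(\mathbf{x}), \ldots, f_p(\mathbf{x}))$. The transpose of $\mathbf{M}$ is denoted $\mathbf{M}^{\mathsf{T}}$.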
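Unless otherwise specified, all vectors are considered to be column vectors, so, for example, an $n \times p$ matrix can be written as $\mathbf{M} = (\mathbf{x}_1 \ldots \mathbf{x}_n)^{\mathsf{T}}$, where each $\mathbf{x}_i$ is a vector of length $p$. Let $\mathbf{I}$ denote an identity matrix, and $\mathbf{1}$ and $\mathbf{0}$ denote vectors of ones and zeros, respectively.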
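As a small illustration of the column-vector convention, the following sketch stacks three vectors of length two as the rows of a 3 × 2 matrix. The data values and the helper names `transpose` and `matvec` are hypothetical, chosen only for this example, and matrices are stored as plain lists of rows.

```python
# Hypothetical data: n = 3 vectors, each of length p = 2.
x1 = [1.0, 2.0]
x2 = [3.0, 4.0]
x3 = [5.0, 6.0]

# M = (x1 x2 x3)^T: each column vector x_i becomes row i of M.
M = [x1, x2, x3]
n, p = len(M), len(M[0])


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def matvec(A, v):
    """Multiply the matrix A (list of rows) by the vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


ones = [1.0] * p    # the vector 1 of length p
zeros = [0.0] * p   # the vector 0 of length p
identity = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]

row_sums = matvec(M, ones)            # M 1 collects the row sums of M
unchanged = matvec(identity, x1)      # I x1 returns x1
```

Here `transpose(M)` recovers the 2 × 3 matrix whose columns are the original vectors.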
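A symmetric square matrix $\mathbf{M}$ is positive definite if $\mathbf{x}^{\mathsf{T}} \mathbf{M} \mathbf{x} > 0$ for all nonzero vectors $\mathbf{x}$. Positive definiteness is equivalent to the condition that all eigenvalues of $\mathbf{M}$ are positive. $\mathbf{M}$ is nonnegative definite or positive semidefinite if $\mathbf{x}^{\mathsf{T}} \mathbf{M} \mathbf{x} \geq 0$ for all nonzero vectors $\mathbf{x}$.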
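The link between the quadratic form and the eigenvalues can be checked by hand for symmetric 2 × 2 matrices, whose eigenvalues have a closed form. The sketch below is restricted to that 2 × 2 case; the function names and the two test matrices are hypothetical, and a general-purpose check would normally rely on a numerical linear algebra library instead.

```python
import math


def quad_form(M, x):
    """Compute the quadratic form x^T M x for a matrix M (list of rows)."""
    k = len(x)
    return sum(x[i] * M[i][j] * x[j] for i in range(k) for j in range(k))


def eigenvalues_sym_2x2(M):
    """Closed-form eigenvalues of a symmetric matrix [[a, b], [b, c]]."""
    a, b, c = M[0][0], M[0][1], M[1][1]
    mean = (a + c) / 2.0
    radius = math.sqrt(((a - c) / 2.0) ** 2 + b ** 2)
    return (mean - radius, mean + radius)


def is_positive_definite_2x2(M):
    """All eigenvalues strictly positive."""
    return min(eigenvalues_sym_2x2(M)) > 0.0


def is_positive_semidefinite_2x2(M):
    """All eigenvalues nonnegative (exact comparison; no rounding tolerance)."""
    return min(eigenvalues_sym_2x2(M)) >= 0.0


A = [[2.0, 1.0], [1.0, 2.0]]   # eigenvalues 1 and 3: positive definite
B = [[1.0, 1.0], [1.0, 1.0]]   # eigenvalues 0 and 2: only semidefinite

x = [1.0, -1.0]
qa = quad_form(A, x)   # strictly positive
qb = quad_form(B, x)   # exactly zero, so B is not positive definite
```

The vector (1, -1) is an eigenvector of B with eigenvalue zero, which is why the quadratic form vanishes there even though the vector is nonzero.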
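The derivative of a function $f$, evaluated at $x$, is denoted $f'(x)$. When $\mathbf{x} = (x_1, \ldots, x_p)$, the gradient of $f$ at $\mathbf{x}$ is
$$\mathbf{f}'(\mathbf{x}) = \left( \frac{d f(\mathbf{x})}{d x_1}, \ldots, \frac{d f(\mathbf{x})}{d x_p} \right).$$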
The Hessian matrix for $f$ at $\mathbf{x}$ is $\mathbf{f}''(\mathbf{x})$, having $(i, j)$th element equal to $d^2 f(\mathbf{x}) / (d x_i \, d x_j)$. The negative Hessian has important uses in statistical inference.
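To make the gradient and Hessian concrete, the following sketch approximates both by central finite differences and compares them with values worked out by hand. The test function, the step sizes `h`, and the helper names `gradient` and `hessian` are assumptions made for illustration only; finite differences are one of several ways to obtain derivatives numerically.

```python
def f(x):
    """Hypothetical test function f(x1, x2) = x1^2 * x2 + 3 * x2^2."""
    return x[0] ** 2 * x[1] + 3.0 * x[1] ** 2


def gradient(f, x, h=1e-5):
    """Central-difference approximation to the gradient of f at x."""
    g = []
    for i in range(len(x)):
        up = list(x)
        dn = list(x)
        up[i] += h
        dn[i] -= h
        g.append((f(up) - f(dn)) / (2.0 * h))
    return g


def hessian(f, x, h=1e-3):
    """Central-difference approximation to the Hessian of f at x."""
    p = len(x)
    H = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            def shifted(si, sj):
                # Evaluate f after moving coordinates i and j by +/- h.
                y = list(x)
                y[i] += si * h
                y[j] += sj * h
                return f(y)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4.0 * h * h)
    return H


x0 = [1.0, 2.0]
g = gradient(f, x0)   # analytic gradient at x0: (2*x1*x2, x1^2 + 6*x2) = (4, 13)
H = hessian(f, x0)    # analytic Hessian at x0: [[2*x2, 2*x1], [2*x1, 6]] = [[4, 2], [2, 6]]
```

The approximate Hessian is symmetric up to rounding error, as expected for a function with continuous second derivatives.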
Let $\mathbf{J}(\mathbf{x})$ denote the Jacobian matrix evaluated at $\mathbf{x}$ for the one-to-one mapping $\mathbf{y} = \mathbf{f}(\mathbf{x})$. The $(i, j)$th element of $\mathbf{J}(\mathbf{x})$ is equal ...