14 Multivariate Analysis I: Reducing Dimensionality
14.1 MULTIVARIATE ANALYSIS: EIGEN-DECOMPOSITION
In this chapter and the next, we will work with several or many variables $X_i$. We will not necessarily assume that some of the variables are independent variables $X_i$ influencing one or more dependent, or response, variables $Y$. Instead, we try to uncover relationships among all the variables and ways of reducing the dimensionality of the dataset. In this chapter, we will cover the methods of principal component analysis (PCA), factor analysis (FA), and correspondence analysis (CA).
All of these methods are based on the eigenvalues and eigenvectors of selected matrices derived from the variance–covariance matrix, often called simply the covariance matrix. We first look at these matrices and their eigen-decomposition.
14.2 VECTORS AND LINEAR TRANSFORMATION
In the following, we think of n observations of a variable as an n × 1 column vector, in such a way that the entries define the coordinates of a point in n-space. The length of the vector is the distance from the origin of coordinates to the point (Davis, 2002, pp. 141–152; Carr, 1995, pp. 50–57). Denote the direction of the vector by an arrow pointing from the origin toward the point. For example, the vector
$$v = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$$

in two-dimensional space is as shown in Figure 14.1. Its length is $\sqrt{2^2 + 1^2} = \sqrt{5} = 2.24$.
Usually, when a square matrix is postmultiplied by a column vector, the result is a vector with a different length and direction from those of the original vector. For example,
$$Av = \begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix} \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 0 \end{bmatrix}$$

will give the result shown in Figure 14.2, with length $\sqrt{3^2 + 0^2} = \sqrt{9} = 3$.
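As a quick numerical check, the following is a minimal sketch in Python, assuming NumPy is available; it reproduces the transformation and the two lengths computed above.

import numpy as np

# Matrix A and vector v from the example above
A = np.array([[1.0, 1.0],
              [-2.0, 4.0]])
v = np.array([2.0, 1.0])

Av = A @ v
print(Av)                  # [3. 0.]
print(np.linalg.norm(v))   # length of v: sqrt(5) = 2.236...
print(np.linalg.norm(Av))  # length of Av: sqrt(9) = 3.0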
14.3 EIGENVALUES AND EIGENVECTORS
However, for each square matrix there is a particular class of vectors that, when premultiplied by the matrix, preserve their direction; the transformation only changes the length of the original vector by a scalar factor. A vector with this property is an eigenvector. The scale factor associated with the transformation of the eigenvector is the eigenvalue.
For example, consider the vector

$$v = \begin{bmatrix} -0.707 \\ -0.707 \end{bmatrix}$$

shown in Figure 14.3. When premultiplied by matrix A as earlier, we get

$$Av = \begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix} \begin{bmatrix} -0.707 \\ -0.707 \end{bmatrix} = \begin{bmatrix} -1.414 \\ -1.414 \end{bmatrix}$$
We can see that the vector does not change direction, but its length has doubled (Figure 14.4). To check that the length has indeed doubled, calculate the new length
$$\sqrt{(-1.414)^2 + (-1.414)^2} = \sqrt{(2 \times 0.707)^2 + (2 \times 0.707)^2} = \sqrt{4 \times (0.707)^2 + 4 \times (0.707)^2} = 2\sqrt{(0.707)^2 + (0.707)^2} = 2 \times 1 = 2$$

which is twice the original length of 1.
Formally, an eigenvalue–eigenvector pair of A is any real or complex number–vector pair, denoted as (λ, v), such that $Av = \lambda v$. Even if we multiply this equation by a scalar k, the equality will hold.
FIGURE 14.1 A vector in 2D.

FIGURE 14.2 A vector v transformed to another vector by matrix multiplication.
Therefore, there are infinitely many eigenvectors associated with a particular eigenvalue. Note that an eigenvalue is a scalar, whereas an eigenvector is a vector.
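This scaling property is easy to verify numerically; the short sketch below, again in Python with NumPy, checks the eigenvector from the example and an arbitrary scalar multiple of it (the value of k is a hypothetical choice for illustration).

import numpy as np

A = np.array([[1.0, 1.0],
              [-2.0, 4.0]])
v = np.array([-0.707, -0.707])

# Av preserves the direction of v; the length ratio is the eigenvalue 2
print(A @ v)                                      # [-1.414 -1.414]
print(np.linalg.norm(A @ v) / np.linalg.norm(v))  # 2.0

# Any scalar multiple k*v is an eigenvector for the same eigenvalue
k = -3.5  # arbitrary scalar chosen for illustration
print(np.allclose(A @ (k * v), 2 * (k * v)))      # True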
As an example, consider the vector given earlier:

$$Av = \begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix} \begin{bmatrix} -0.707 \\ -0.707 \end{bmatrix} = \begin{bmatrix} -1.414 \\ -1.414 \end{bmatrix} = 2 \begin{bmatrix} -0.707 \\ -0.707 \end{bmatrix} = 2v$$
We can see that λ = 2 is an eigenvalue of A, with v as a corresponding eigenvector.
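Such pairs can also be obtained numerically. A minimal sketch, assuming NumPy: np.linalg.eig returns the eigenvalues and unit-length eigenvectors (as columns); their order, and the sign of each eigenvector, may vary.

import numpy as np

A = np.array([[1.0, 1.0],
              [-2.0, 4.0]])

# Eigen-decomposition of A
values, vectors = np.linalg.eig(A)
print(values)         # [2. 3.] (order not guaranteed)
print(vectors[:, 0])  # eigenvector for lambda = 2, proportional to (0.707, 0.707)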
14.3.1 Finding Eigenvalues
The denition provided earlier Av = λv is equivalent to Avλv = 0 or equivalently λvAv = 0.
Factoring out the vector v, and inserting the identity matrix I (remember the identity matrix does
not change the value of the vector), we get
()
= 0.
λ
IAv
(14.1)
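For a nonzero vector v, Equation 14.1 can hold only if the matrix $(\lambda I - A)$ is singular, that is, if $\det(\lambda I - A) = 0$; the eigenvalues are the roots of this characteristic polynomial. A minimal numerical check, again assuming NumPy:

import numpy as np

A = np.array([[1.0, 1.0],
              [-2.0, 4.0]])

# Coefficients of the characteristic polynomial det(lambda*I - A):
# here lambda^2 - 5*lambda + 6, whose roots are the eigenvalues
coeffs = np.poly(A)
print(coeffs)            # [ 1. -5.  6.]
print(np.roots(coeffs))  # [3. 2.]

# At an eigenvalue, (lambda*I - A) is singular, so its determinant is zero
lam = 2.0
print(np.linalg.det(lam * np.eye(2) - A))  # 0.0 (up to rounding)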
FIGURE 14.3 A special vector for the matrix A.

FIGURE 14.4 The special vector preserves direction but has doubled in length.