14 Multivariate Analysis I: Reducing Dimensionality
14.1 MULTIVARIATE ANALYSIS: EIGEN-DECOMPOSITION
In this chapter and the next, we will work with several or many variables X_i. First, we will not necessarily assume that some of the variables are independent variables X_i influencing one or more dependent, or response, variables Y. Instead, we try to uncover relationships among all the variables and ways of reducing the dimensionality of the dataset. In this chapter, we will cover the methods of principal component analysis (PCA), factor analysis (FA), and correspondence analysis (CA).
All of these methods are based on the eigenvalues and eigenvectors of selected matrices derived from the variance–covariance matrix, often simply called the covariance matrix. We first look at these matrices and their eigen-decomposition.
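As a quick preview of this operation, here is a minimal sketch in Python (NumPy and the small made-up dataset are assumptions for illustration; they are not part of the text): it computes the covariance matrix of a few observations of three variables and then its eigenvalues and eigenvectors.

```python
import numpy as np

# hypothetical dataset: 5 observations of 3 variables X1, X2, X3
X = np.array([[2.1, 3.0, 1.2],
              [1.9, 2.8, 1.0],
              [2.5, 3.6, 1.7],
              [2.0, 3.1, 1.3],
              [2.3, 3.3, 1.5]])

S = np.cov(X, rowvar=False)           # variance-covariance matrix (variables in columns)
vals, vecs = np.linalg.eigh(S)        # eigh: eigen-decomposition for symmetric matrices

print("Covariance matrix:\n", S)
print("Eigenvalues:", vals)
print("Eigenvectors (columns):\n", vecs)
```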
14.2 VECTORS AND LINEAR TRANSFORMATION
In the following, we think of n observations of a variable as an n × 1 column vector, in such a way that the entries define the coordinates of a point in n-space. The length of the vector will be the distance from the origin of coordinates to the point (Davis, 2002, pp. 141–152; Carr, 1995, pp. 50–57). Denote the direction of the vector by an arrow pointing from the origin toward the point. For example, the vector

v = \begin{bmatrix} 2 \\ 1 \end{bmatrix}

in two-dimensional space is as in Figure 14.1. Its length is \sqrt{2^2 + 1^2} = \sqrt{5} \approx 2.24.
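As a quick numerical check of this length (a sketch assuming NumPy, which is not used in the text):

```python
import numpy as np

v = np.array([2, 1])               # the vector of Figure 14.1
print(np.sqrt(np.sum(v**2)))       # sqrt(2^2 + 1^2) = sqrt(5), about 2.24
print(np.linalg.norm(v))           # equivalent, using the built-in Euclidean norm
```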
Usually, when a square matrix is postmultiplied by a column vector, the result is a vector with a different length and a different direction than the original vector. For example,

Av = \begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix}
\begin{bmatrix} 2 \\ 1 \end{bmatrix}
= \begin{bmatrix} 3 \\ 0 \end{bmatrix}

will give the result shown in Figure 14.2, with length \sqrt{3^2 + 0^2} = \sqrt{9} = 3.
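The same transformation can be checked numerically (again a sketch assuming NumPy; the matrix A is the one used in the example above):

```python
import numpy as np

A = np.array([[ 1, 1],
              [-2, 4]])
v = np.array([2, 1])

Av = A @ v                          # matrix postmultiplied by the column vector
print(Av)                           # [3 0]
print(np.linalg.norm(v))            # about 2.24, the length of v
print(np.linalg.norm(Av))           # 3.0, the length of the transformed vector
```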
14.3 EIGENVALUES AND EIGENVECTORS
However, for each square matrix there is a particular class of vectors that, when premultiplied by the matrix, preserve their direction; the transformation only changes the length of the original vector by a scalar factor. A vector with this property is an eigenvector, and the scale factor associated with the transformation of the eigenvector is the eigenvalue. That is, for an eigenvector v of A we have Av = λv, where the scalar λ is the corresponding eigenvalue.
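A minimal numerical illustration of this definition (a sketch assuming NumPy; the matrix A is the one from the example above):

```python
import numpy as np

A = np.array([[ 1, 1],
              [-2, 4]])

# eigenvalues and eigenvectors of A; each column of vecs is an eigenvector
vals, vecs = np.linalg.eig(A)
print(vals)                         # for this matrix: 2 and 3 (order may vary)

u = vecs[:, 0]                      # take the first eigenvector
print(A @ u)                        # premultiplying by A ...
print(vals[0] * u)                  # ... only rescales u by its eigenvalue
```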