14 Multivariate Analysis I: Reducing Dimensionality
14.1 MULTIVARIATE ANALYSIS: EIGEN-DECOMPOSITION
In this chapter and the next, we will work with several or many variables $X_i$. We will not necessarily assume that some of the variables are independent variables $X_i$ influencing one or more dependent, or response, variables $Y$. Instead, we try to uncover relationships among all the variables and ways of reducing the dimensionality of the dataset. In this chapter, we will cover the methods of principal component analysis (PCA), factor analysis (FA), and correspondence analysis (CA).
All of these methods are based on the eigenvalues and eigenvectors of selected matrices derived from the variance–covariance matrix, often called simply the covariance matrix. We first look at these matrices and their eigen-decomposition.
14.2 VECTORS AND LINEAR TRANSFORMATION
In the following, we think of n observations of a variable as an n × 1 column vector, in such a way that the entries define the coordinates of a point in n-space. The length of the vector is the distance from the origin of coordinates to the point (Davis, 2002, pp. 141–152; Carr, 1995, pp. 50–57). Denote the direction of the vector by an arrow pointing from the origin toward the point. For example, the vector
$$v = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$$

in two-dimensional space is as shown in Figure 14.1. Its length is $\sqrt{2^2 + 1^2} = \sqrt{5} = 2.24$.
Usually, when a square matrix is postmultiplied by a column vector, the result is a vector with a different length and direction from those of the original vector. For example,
$$Av = \begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix} \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 0 \end{bmatrix}$$

will give the result shown in Figure 14.2, with length $\sqrt{3^2 + 0^2} = \sqrt{9} = 3$.
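As a quick numerical check, the following is a minimal sketch in Python, assuming NumPy is available; it reproduces the transformation and the two lengths computed above.

import numpy as np

# Matrix A and vector v from the example above
A = np.array([[1.0, 1.0],
              [-2.0, 4.0]])
v = np.array([2.0, 1.0])

Av = A @ v
print(Av)                  # [3. 0.]
print(np.linalg.norm(v))   # length of v: sqrt(5) = 2.236...
print(np.linalg.norm(Av))  # length of Av: sqrt(9) = 3.0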
14.3 EIGENVALUES AND EIGENVECTORS
However, for each square matrix there is a particular class of vectors that, when premultiplied by the matrix, preserve their direction; the transformation only changes the length of the original vector by a scalar factor. A vector with this property is an eigenvector. The scale factor associated with the transformation of the eigenvector is the eigenvalue.
For example, consider the vector

$$v = \begin{bmatrix} -0.707 \\ -0.707 \end{bmatrix}$$

shown in Figure 14.3. When premultiplied by matrix A as earlier, we get

$$Av = \begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix} \begin{bmatrix} -0.707 \\ -0.707 \end{bmatrix} = \begin{bmatrix} -1.414 \\ -1.414 \end{bmatrix}$$
We can see that the vector does not change direction, but its length has doubled (Figure 14.4). To check that the length has indeed doubled, calculate the new length
$$\sqrt{(-1.414)^2 + (-1.414)^2} = \sqrt{(2 \times 0.707)^2 + (2 \times 0.707)^2} = \sqrt{4 \times (0.707)^2 + 4 \times (0.707)^2} = 2\sqrt{(0.707)^2 + (0.707)^2} = 2 \times 1 = 2$$

which is twice the original length of 1.
Formally, an eigenvalue–eigenvector pair of A is any real or complex number–vector pair, denoted as (λ, v), such that $Av = \lambda v$. Even if we multiply this equation by a scalar k, the equality will hold.
FIGURE 14.1 A vector in 2D.

FIGURE 14.2 A vector v transformed to another vector by matrix multiplication.
Therefore, there are infinitely many eigenvectors associated with a particular eigenvalue. Note that an eigenvalue is a scalar, whereas an eigenvector is a vector.
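This scaling property is easy to verify numerically; the short sketch below, again in Python with NumPy, checks the eigenvector from the example and an arbitrary scalar multiple of it (the value of k is a hypothetical choice for illustration).

import numpy as np

A = np.array([[1.0, 1.0],
              [-2.0, 4.0]])
v = np.array([-0.707, -0.707])

# Av preserves the direction of v; the length ratio is the eigenvalue 2
print(A @ v)                                      # [-1.414 -1.414]
print(np.linalg.norm(A @ v) / np.linalg.norm(v))  # 2.0

# Any scalar multiple k*v is an eigenvector for the same eigenvalue
k = -3.5  # arbitrary scalar chosen for illustration
print(np.allclose(A @ (k * v), 2 * (k * v)))      # True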
As an example, consider the vector given earlier:

$$Av = \begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix} \begin{bmatrix} -0.707 \\ -0.707 \end{bmatrix} = \begin{bmatrix} -1.414 \\ -1.414 \end{bmatrix} = 2 \begin{bmatrix} -0.707 \\ -0.707 \end{bmatrix} = 2v$$
We can see that λ = 2 is an eigenvalue of A, with v as a corresponding eigenvector.
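Such pairs can also be obtained numerically. A minimal sketch, assuming NumPy: np.linalg.eig returns the eigenvalues and unit-length eigenvectors (as columns); their order, and the sign of each eigenvector, may vary.

import numpy as np

A = np.array([[1.0, 1.0],
              [-2.0, 4.0]])

# Eigen-decomposition of A
values, vectors = np.linalg.eig(A)
print(values)         # [2. 3.] (order not guaranteed)
print(vectors[:, 0])  # eigenvector for lambda = 2, proportional to (0.707, 0.707)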
14.3.1 Finding Eigenvalues
The denition provided earlier Av = λv is equivalent to Avλv = 0 or equivalently λvAv = 0.
Factoring out the vector v, and inserting the identity matrix I (remember the identity matrix does
not change the value of the vector), we get
()
= 0.
λ
IAv
(14.1)
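For a nonzero vector v, Equation 14.1 can hold only if the matrix $(\lambda I - A)$ is singular, that is, if $\det(\lambda I - A) = 0$; the eigenvalues are the roots of this characteristic polynomial. A minimal numerical check, again assuming NumPy:

import numpy as np

A = np.array([[1.0, 1.0],
              [-2.0, 4.0]])

# Coefficients of the characteristic polynomial det(lambda*I - A):
# here lambda^2 - 5*lambda + 6, whose roots are the eigenvalues
coeffs = np.poly(A)
print(coeffs)            # [ 1. -5.  6.]
print(np.roots(coeffs))  # [3. 2.]

# At an eigenvalue, (lambda*I - A) is singular, so its determinant is zero
lam = 2.0
print(np.linalg.det(lam * np.eye(2) - A))  # 0.0 (up to rounding)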
FIGURE 14.3 A special vector for the matrix A.

FIGURE 14.4 The special vector preserves direction but has doubled in length.