Multivariate Analysis I
14.5 PRINCIPAL COMPONENTS ANALYSIS (PCA)
The objective is to explain the data using new variables called principal components, which are
linear combinations of the original variables. We want to generate new and fewer variables capable of
explaining most of the variance. The coefficients that define the new variables (components) are the
eigenvectors of the covariance matrix. The components are mutually orthogonal (perpendicular in 2D).
Together they account for all of the data variance, but typically just a few of them explain most
of it (Davis, 2002 pp. 509–526).
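As a minimal sketch of this idea, the following Python example computes principal components from the eigenvectors of the covariance matrix. It is an illustration under stated assumptions, not the text's own procedure: the data are synthetic, and the names `X`, `Xc`, `C`, `scores`, and `explained` are hypothetical choices made here. NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical synthetic data: n = 200 observations of m = 3 variables,
# two of which are strongly correlated through a shared latent factor.
rng = np.random.default_rng(0)
n = 200
latent = rng.normal(size=(n, 1))
X = np.hstack([
    latent + 0.1 * rng.normal(size=(n, 1)),
    2.0 * latent + 0.1 * rng.normal(size=(n, 1)),
    rng.normal(size=(n, 1)),
])

# Center each variable (column) on its mean.
Xc = X - X.mean(axis=0)

# Sample covariance matrix of the m variables (columns are variables).
C = np.cov(Xc, rowvar=False)

# eigh is suited to symmetric matrices; it returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(C)

# Reorder so the component with the largest variance comes first.
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]

# Component scores: the data expressed in the new (orthogonal) variables.
scores = Xc @ eigvecs

# Fraction of the total variance explained by each component.
explained = eigvals / eigvals.sum()
print(np.round(explained, 3))
```

In this sketch the eigenvalues are the variances of the components, so the first entry of `explained` shows how much of the total variance a single new variable captures. Because two of the synthetic variables are nearly collinear, most of the variance falls on the first component.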
Consider X to be an n × m data matrix as in the previous section. Its columns correspond to
values of the m variables [X_1, X_2, …, X_m]; the rows to the n individual observations. We assume that
n