20 Factor Analysis and Principal Components

Factor analysis and principal component analysis are mathematically related: both rely on calculating eigenvectors (of a correlation matrix or of a covariance matrix of normalized data), both are data reduction techniques that help to reduce the dimensionality of the data, and their outputs look very much alike. Despite these similarities, they solve different problems: principal component analysis (PCA) constructs linear combinations of the variables such that the resulting principal components (PCs) are orthogonal, whereas factor analysis is a measurement model of a latent variable.

20.1 Principal Component Analysis (PCA)

PCA is a data reduction technique that calculates new variables from the set of measured variables. These new variables are linear combinations (think of them as weighted averages) of the measured variables (columns). The index variables that result from this process are called “components.”
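To make the idea of "linear combinations of the measured variables" concrete, here is a minimal sketch in base R. The choice of the built-in `mtcars` data and of the five columns is an assumption made purely for illustration; it is not taken from the text. The sketch uses `prcomp()` from the `stats` package and then rebuilds the component scores by hand as weighted sums of the standardized columns.

```r
# Illustrative data: five numeric columns of the built-in mtcars dataset
# (this selection is an assumption, any numeric data frame would do).
d <- mtcars[, c("mpg", "disp", "hp", "wt", "qsec")]

# PCA on centred and scaled data (equivalent to using the correlation matrix).
pca <- prcomp(d, center = TRUE, scale. = TRUE)

# Each column of `rotation` holds the weights that define one component.
print(round(pca$rotation, 3))

# A component score is the weighted sum of the standardized variables:
scores_manual <- scale(d) %*% pca$rotation
same_scores <- isTRUE(all.equal(scores_manual, pca$x, check.attributes = FALSE))

# The weight vectors are orthonormal, so their cross-product is the identity.
orthonormal <- isTRUE(all.equal(crossprod(pca$rotation), diag(ncol(d)),
                                check.attributes = FALSE))
```

Both `same_scores` and `orthonormal` should evaluate to `TRUE`: the scores reported by `prcomp()` are nothing more than the standardized data multiplied by the weight matrix.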

The process relies on finding eigenvalues and their eigenvectors and, unless the covariance matrix is singular, this results in as many new variables as the dataset has measured variables. However, PCA also gives us the tools to decide how many of those components are really necessary. So we can find an optimal number of components, which implies an optimal choice of measured variables for each component, and their optimal weights in those principal ...
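The link between eigenvalues and the number of components worth keeping can be sketched as follows. The data (`mtcars`) and the 90% cumulative-variance threshold are assumptions chosen for illustration only; other rules of thumb exist and the text does not prescribe one here.

```r
# Illustrative data (an assumption): five numeric columns of mtcars.
d <- mtcars[, c("mpg", "disp", "hp", "wt", "qsec")]

# Eigen-decomposition of the correlation matrix.
e <- eigen(cor(d))

# A non-singular matrix yields as many eigenvalues as measured variables.
n_components <- length(e$values)

# Share of the total variance captured by each component, and the running total.
prop_var <- e$values / sum(e$values)
cum_var  <- cumsum(prop_var)
print(round(rbind(prop_var, cum_var), 3))

# Hypothetical rule: keep the fewest components explaining at least 90%.
k <- which(cum_var >= 0.90)[1]

# Cross-check: the squared standard deviations reported by prcomp()
# are the same eigenvalues.
pca <- prcomp(d, center = TRUE, scale. = TRUE)
matches_prcomp <- isTRUE(all.equal(pca$sdev^2, e$values))
```

A scree plot of the same information is available with `screeplot(pca)`, and `summary(pca)` prints the proportion of variance per component.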
