Principal components and factor analyses

Principal component analysis (PCA) is a well-known undirected method for reducing the number of variables used in further analyses. This process is also called dimensionality reduction. In some projects, you could have hundreds, maybe even thousands, of input variables. Using all of them as input to a clustering algorithm could make training the model take an enormous amount of time. However, many of those input variables might vary together, meaning that there might be some association between them.

PCA starts again with the hyperspace, where each input variable defines one axis. PCA then searches for a set of new axes, that is, a set of new variables called the principal components, which should be linearly uncorrelated. The principal components ...
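To make the idea of new, linearly uncorrelated axes concrete, here is a minimal sketch in plain Python for the simplest case of just two input variables. It is an illustration only, not the code used in this book: the function name `pca_2d` and the sample values are made up for the example, and the closed-form rotation it uses works only for two variables. With more variables, you would normally rely on a library routine that performs an eigenvalue or singular value decomposition.

```python
import math


def pca_2d(xs, ys):
    """Illustrative PCA for exactly two variables (hypothetical helper).

    Returns the variances along the two new axes, the axes themselves
    as unit vectors, and the data expressed in the new coordinates.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Center both variables so that each new axis passes through the mean.
    cx = [x - mean_x for x in xs]
    cy = [y - mean_y for y in ys]

    # Sample covariance matrix [[a, b], [b, c]] of the two variables.
    a = sum(u * u for u in cx) / (n - 1)
    c = sum(v * v for v in cy) / (n - 1)
    b = sum(u * v for u, v in zip(cx, cy)) / (n - 1)

    # Rotation angle of the direction with the largest variance.
    # This closed form is valid for a symmetric 2 x 2 matrix only.
    theta = 0.5 * math.atan2(2 * b, a - c)
    pc1 = (math.cos(theta), math.sin(theta))
    pc2 = (-math.sin(theta), math.cos(theta))

    # Project the centered data onto the new axes.
    scores = [
        (u * pc1[0] + v * pc1[1], u * pc2[0] + v * pc2[1])
        for u, v in zip(cx, cy)
    ]

    # Variance of the data along each new axis.
    var1 = sum(s[0] ** 2 for s in scores) / (n - 1)
    var2 = sum(s[1] ** 2 for s in scores) / (n - 1)
    return (var1, var2), (pc1, pc2), scores


# Two made-up variables that clearly vary together.
xs = [2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1]
ys = [2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9]

variances, components, scores = pca_2d(xs, ys)
explained = variances[0] / (variances[0] + variances[1])
print("Share of total variance on the first component:", round(explained, 3))
```

Because the two sample variables are strongly associated, most of the total variance should end up on the first new axis, which is what would let you keep that single component in place of both original variables. The new coordinates in `scores` have zero covariance with each other, which is the "linearly uncorrelated" property described above.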
