Principal component analysis (PCA) is an unsupervised feature extraction algorithm that takes m-dimensional input and produces a set of n linearly uncorrelated variables, called principal components, where n is typically much smaller than m (m >> n). The components are chosen so that as little variance (or information) as possible is lost by discarding the remaining (m - n) dimensions.
The linear transformation in PCA is constructed so that the first principal component captures the maximum possible variance (or information). It does so by exploiting correlations among the input variables: variables that are highly correlated with each other are combined into a single direction. Each principal component has at least as much variance as every succeeding component and is orthogonal to all of the preceding components.
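The properties described above can be checked numerically. The following is a minimal sketch, not a reference implementation: it assumes NumPy is available and computes the components from an eigendecomposition of the sample covariance matrix, which is one common way to carry out PCA. The function name pca_sketch and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np


def pca_sketch(X, n_components):
    """Project the rows of X (samples x features) onto the top principal components.

    Hypothetical helper for illustration; uses the covariance eigendecomposition.
    """
    # Center each feature so that variance is measured around the mean.
    X_centered = X - X.mean(axis=0)

    # Sample covariance matrix of the features (m x m).
    cov = np.cov(X_centered, rowvar=False)

    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)

    # Reorder so the direction with the largest variance comes first.
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]

    # Keep the first n_components directions and project the data onto them.
    components = eigvecs[:, :n_components]
    scores = X_centered @ components
    return scores, components, eigvals


# Synthetic data (an assumption for this sketch): the second feature is
# almost a multiple of the first; the third is independent noise.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
X = np.column_stack([
    x,
    2.0 * x + rng.normal(scale=0.1, size=500),
    rng.normal(size=500),
])

scores, components, eigvals = pca_sketch(X, n_components=2)

# Fraction of the total variance carried by each component, largest first.
print(eigvals / eigvals.sum())

# Covariance of the projected data; off-diagonal entries should be near zero,
# showing that the new variables are uncorrelated.
print(np.cov(scores, rowvar=False))
```

The eigenvalues give the variance along each component, so sorting them in descending order yields the ordering described above, and the eigenvectors of a symmetric matrix are mutually orthogonal, which is why every component is orthogonal to all the others.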
Consider a three-dimensional space where two ...