Principal Component Analysis

Principal component analysis (PCA) is a statistical technique for representing high-dimensional data in a low-dimensional space. It is usually used to reduce the dimensionality of data so that the data can be visualized or analyzed further in the low-dimensional space. For example, we may use PCA to represent data records that have 100 attribute variables by data records that have only 2 or 3 variables. In this chapter, a review of multivariate statistics and matrix algebra is first given to lay the mathematical foundation of PCA. Then, PCA is described and illustrated. A list of software packages that support PCA is provided. Some applications of PCA are given with references.
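To make the 100-variable example concrete, the following is a minimal sketch of the idea rather than the procedure developed in this chapter. It assumes NumPy is available, uses synthetic random data in place of real data records, and computes the principal components through a singular value decomposition of the mean-centered data. The function name pca_project and its return values are hypothetical names chosen for illustration.

```python
import numpy as np


def pca_project(X, k):
    """Project the rows of X onto the top-k principal components.

    Hypothetical helper for illustration: X is an (n_records, n_variables)
    array and k is the number of variables to keep.
    """
    # Center each attribute variable so that it has zero mean.
    Xc = X - X.mean(axis=0)
    # SVD of the centered data; the rows of Vt are the principal directions,
    # ordered by decreasing singular value.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]
    # Sample variance of the data along each retained direction.
    explained_variance = S[:k] ** 2 / (X.shape[0] - 1)
    # Coordinates of every record in the k-dimensional space.
    Z = Xc @ components.T
    return Z, components, explained_variance


# Assumption: synthetic data standing in for 500 records with 100 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 100))

Z, components, explained_variance = pca_project(X, 2)
print(Z.shape)  # (500, 2)
```

Each row of Z is one data record expressed with only 2 variables, so the records can be drawn on an ordinary scatter plot. The SVD route is one common way to obtain the components; an eigendecomposition of the covariance matrix of the centered data gives the same directions up to sign.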

14.1  Review of Multivariate Statistics ...
