Principal component analysis

Recall from previous chapters that problems involving high-dimensional data can be affected by the curse of dimensionality. As the number of dimensions of a dataset increases, the number of samples required for an estimator to generalize increases exponentially. Acquiring that much data may be infeasible in some applications, and learning from large datasets requires more memory and processing power. Furthermore, data often becomes sparser as the number of dimensions grows. It can become more difficult to detect similar instances in a high-dimensional space, because all instances tend to appear similarly distant from one another.
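The difficulty of detecting similar instances can be illustrated with a small experiment. The following is a minimal sketch, not taken from the original text: it assumes points drawn uniformly at random from the unit hypercube and uses a hypothetical helper, nearest_to_farthest_ratio, to compare the distance from a query point to its nearest and farthest neighbors. A ratio close to 1 suggests that the nearest neighbor is barely closer than the farthest one.

```python
import math
import random


def nearest_to_farthest_ratio(n_dims, n_samples=500, seed=0):
    """Ratio of nearest to farthest Euclidean distance from a random query point.

    Hypothetical helper for illustration; the points are uniform random
    samples from the unit hypercube, which is an assumption of this sketch.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(n_dims)]
    distances = []
    for _ in range(n_samples):
        point = [rng.random() for _ in range(n_dims)]
        distances.append(math.dist(query, point))
    return min(distances) / max(distances)


# Compare the ratio across an increasing number of dimensions.
ratios = {d: nearest_to_farthest_ratio(d) for d in (2, 10, 100, 1000)}
for n_dims, ratio in ratios.items():
    print(f"{n_dims:>4} dimensions: nearest/farthest = {ratio:.3f}")
```

Under these assumptions, the ratio should move toward 1 as the number of dimensions grows, which is one way to see why distance-based notions of similarity lose contrast in high-dimensional spaces.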

PCA, also known as the Karhunen-Loeve Transform (KLT), is a technique for finding patterns in high-dimensional data. PCA ...
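As a rough illustration of PCA applied to a dataset, the following sketch uses scikit-learn's PCA class to project the four-dimensional Iris data onto two principal components. The choice of dataset and the number of components are assumptions made for this example only; they are not taken from the text above.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

# Load a small four-dimensional dataset (an illustrative choice).
X, y = load_iris(return_X_y=True)

# Fit PCA and project the data onto the first two principal components.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print("Original shape:", X.shape)
print("Reduced shape:", X_reduced.shape)
# Fraction of the total variance captured by each retained component.
print("Explained variance ratio:", pca.explained_variance_ratio_)
```

The explained_variance_ratio_ attribute gives a quick check of how much of the data's variance the retained components preserve.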
