Principal components analysis (PCA)

Principal components analysis transforms an original set of features into a new set of features, the principal components, ordered by decreasing variance. PCA enables the data scientist to keep the components that capture most of the variance in the data, on the assumption that the higher-variance components have the most impact on the classification or prediction.

The original observations (vectors of feature values) are transformed into a set of variables with a lower degree of correlation; the principal components themselves are linearly uncorrelated.

Let's consider a model with two features {x, y} and a set of observations {x_i, y_i} plotted in the following chart:


Visualization of the principal components for a two-dimensional model
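For a two-feature model such as this one, the principal components can be computed in closed form from the 2x2 sample covariance matrix. The following is a minimal sketch under that assumption, not the book's implementation: the names Pca2D, Components, fit and project are hypothetical, and the observations in main are made up for illustration.

```scala
// Illustrative sketch only: Pca2D, Components, fit and project are hypothetical
// names, not part of any library or of the book's code.
object Pca2D {

  /** Mean of the data, variances along each component (decreasing order)
    * and the matching unit-length directions. */
  final case class Components(
      mean: (Double, Double),
      variances: (Double, Double),
      axes: ((Double, Double), (Double, Double)))

  def fit(obs: Seq[(Double, Double)]): Components = {
    require(obs.size > 1, "at least two observations are required")
    val n = obs.size.toDouble
    val mx = obs.map(_._1).sum / n
    val my = obs.map(_._2).sum / n

    // Sample covariance matrix [[sxx, sxy], [sxy, syy]]
    val sxx = obs.map { case (x, _) => (x - mx) * (x - mx) }.sum / (n - 1)
    val syy = obs.map { case (_, y) => (y - my) * (y - my) }.sum / (n - 1)
    val sxy = obs.map { case (x, y) => (x - mx) * (y - my) }.sum / (n - 1)

    // Closed-form eigenvalues of a symmetric 2x2 matrix, largest first
    val half = (sxx + syy) / 2.0
    val disc = math.hypot((sxx - syy) / 2.0, sxy)
    val (l1, l2) = (half + disc, half - disc)

    // Unit eigenvector for l1; the second axis is its perpendicular
    val (ux, uy) =
      if (math.abs(sxy) > 1e-12) {
        val norm = math.hypot(sxy, l1 - sxx)
        (sxy / norm, (l1 - sxx) / norm)
      } else if (sxx >= syy) (1.0, 0.0)
      else (0.0, 1.0)

    Components((mx, my), (l1, l2), ((ux, uy), (-uy, ux)))
  }

  /** Coordinates of each observation along the two principal components. */
  def project(obs: Seq[(Double, Double)], c: Components): Seq[(Double, Double)] = {
    val ((ux, uy), (vx, vy)) = c.axes
    obs.map { case (x, y) =>
      val (dx, dy) = (x - c.mean._1, y - c.mean._2)
      (dx * ux + dy * uy, dx * vx + dy * vy)
    }
  }

  def main(args: Array[String]): Unit = {
    // Made-up observations {x_i, y_i}, for illustration only
    val obs = Seq((2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0))
    val c = fit(obs)
    println(s"variances: ${c.variances}")
    println(s"first axis: ${c.axes._1}")
    println(s"scores: ${project(obs, c)}")
  }
}
```

The first value in variances is the variance along the first principal component, and the projected scores have zero sample covariance between the two components. With more than two features a closed form is no longer practical, and an eigendecomposition or singular value decomposition from a linear algebra library would normally be used instead.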

The ...
