July 2017
Intermediate to advanced
796 pages
18h 55m
English
Principal component analysis (PCA) is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. It can be used to project feature vectors into a lower-dimensional space, and an ML model can then be trained on the reduced feature vectors. The following example shows how to project six-dimensional feature vectors onto four principal components. Suppose you have the following feature vectors:
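// Vectors from the DataFrame-based API (spark.ml)
import org.apache.spark.ml.linalg.Vectors

// Three sample observations, each a six-dimensional dense feature vector
val data = Array(
  Vectors.dense(3.5, 2.0, 5.0, 6.3, 5.60, 2.4),
  Vectors.dense(4.40, 0.10, 3.0, 9.0, 7.0, 8.75),
  Vectors.dense(3.20, 2.40, 0.0, 6.0, 7.4, 3.34)
)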
Now let's create a DataFrame from it, as follows:
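A minimal sketch of this step and the projection that follows it, assuming Spark ML's DataFrame-based `PCA` estimator and a local `SparkSession`. The application name, the `local[*]` master, and the column names `features` and `pcaFeatures` are illustrative assumptions, not names taken from the text:

```scala
import org.apache.spark.ml.feature.PCA
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.SparkSession

// Assumption: a local session; in spark-shell, getOrCreate() returns the existing one
val spark = SparkSession.builder()
  .appName("PCAExample")
  .master("local[*]")
  .getOrCreate()

// The same three six-dimensional feature vectors as above
val data = Array(
  Vectors.dense(3.5, 2.0, 5.0, 6.3, 5.60, 2.4),
  Vectors.dense(4.40, 0.10, 3.0, 9.0, 7.0, 8.75),
  Vectors.dense(3.20, 2.40, 0.0, 6.0, 7.4, 3.34)
)

// Wrap each vector in a Tuple1 so Spark can infer a single-column schema
val df = spark.createDataFrame(data.map(Tuple1.apply)).toDF("features")

// Fit a PCA model that keeps the top four principal components (k = 4)
val pca = new PCA()
  .setInputCol("features")
  .setOutputCol("pcaFeatures")
  .setK(4)
  .fit(df)

// Project every six-dimensional vector onto the four components
val result = pca.transform(df).select("pcaFeatures")
result.show(false)
```

Here `fit` computes the principal components from the `features` column, and `transform` appends the projected four-dimensional vectors as `pcaFeatures`. With only three observations, at most two components can carry nonzero variance, so this dataset serves to illustrate the API rather than a realistic reduction.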