July 2017
Intermediate to advanced
796 pages
18h 55m
English
PCA, which is widely used for dimensionality reduction, is a statistical method that finds a rotation of the data such that the first coordinate has the largest variance possible, and each succeeding coordinate in turn has the largest variance possible while remaining orthogonal to the preceding ones.
The PCA model computes this rotation and returns it as a rotation matrix, whose columns are called principal components. Spark MLlib supports PCA for tall-and-skinny matrices (many rows, few columns) stored in a row-oriented format, and for any vectors.
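To make the idea of a rotation matrix concrete, here is a minimal sketch in plain Python for two-dimensional data. It is an illustration only and does not use Spark MLlib: the function names (`covariance_2d`, `pca_rotation_2d`, `project`) and the sample `rows` are hypothetical, and the closed-form angle formula applies only to the 2D case. A distributed implementation would typically rely on an eigen-decomposition or SVD instead.

```python
import math

def covariance_2d(rows):
    """Return (var_x, cov_xy, var_y) for a list of (x, y) pairs (sample covariance)."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in rows) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for _, y in rows) / (n - 1)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in rows) / (n - 1)
    return var_x, cov_xy, var_y

def pca_rotation_2d(rows):
    """Return a 2x2 rotation matrix whose columns are the principal components."""
    var_x, cov_xy, var_y = covariance_2d(rows)
    # Angle of the direction of largest variance (valid for 2D data only).
    theta = 0.5 * math.atan2(2.0 * cov_xy, var_x - var_y)
    c, s = math.cos(theta), math.sin(theta)
    # First column (c, s): largest variance. Second column (-s, c): orthogonal to it.
    return [[c, -s],
            [s, c]]

def project(rows, rotation):
    """Multiply each row vector by the rotation matrix."""
    return [(x * rotation[0][0] + y * rotation[1][0],
             x * rotation[0][1] + y * rotation[1][1]) for x, y in rows]

# Hypothetical sample data: two strongly correlated features.
rows = [(2.0, 1.9), (0.5, 0.7), (3.1, 3.0), (1.2, 1.0), (4.0, 4.2), (2.6, 2.4)]

rotation = pca_rotation_2d(rows)
projected = project(rows, rotation)
var_first, cov_after, var_second = covariance_2d(projected)

print("rotation matrix:", rotation)
print("variance of first coordinate:", var_first)
print("variance of second coordinate:", var_second)
print("covariance after rotation:", cov_after)
```

After the rotation, the first coordinate carries the largest share of the variance and the two coordinates are uncorrelated, so keeping only the first column of `projected` would reduce the data from two dimensions to one.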