Spark Cookbook by Rishi Yadav

Dimensionality reduction with singular value decomposition

Often, the original dimensions do not represent the data in the best way possible. As we saw with PCA, you can sometimes project the data onto fewer dimensions and still retain most of the useful information.

Sometimes, the best approach is to align the dimensions along the directions that exhibit the most variation. This approach helps to eliminate dimensions that are not representative of the data.
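To make the idea concrete, here is a minimal sketch in plain Python of aligning a single dimension with the direction of greatest variation in two-dimensional data. It is not Spark code and not the book's recipe: the function names `principal_direction` and `project` and the sample points are made up for illustration, and the closed-form angle formula used here only works for the two-dimensional case.

```python
import math

def principal_direction(points):
    """Return (mean, unit direction, singular values) for 2-D points.

    Hypothetical helper for illustration; valid for two dimensions only.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Entries of X^T X, where X is the centered data matrix (n rows, 2 columns).
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    # Closed-form angle of the leading eigenvector of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    direction = (math.cos(theta), math.sin(theta))
    # Eigenvalues of X^T X; their square roots are the singular values of X.
    half_trace = (sxx + syy) / 2
    spread = math.hypot((sxx - syy) / 2, sxy)
    singular_values = (
        math.sqrt(half_trace + spread),
        math.sqrt(max(half_trace - spread, 0.0)),
    )
    return (mx, my), direction, singular_values

def project(points, mean, direction):
    """Reduce each 2-D point to one coordinate along the given direction."""
    return [
        (x - mean[0]) * direction[0] + (y - mean[1]) * direction[1]
        for x, y in points
    ]

# Made-up sample data lying roughly along the line y = x.
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.0)]
mean, direction, singular_values = principal_direction(points)
reduced = project(points, mean, direction)
```

When the first singular value is much larger than the second, as it is for points that lie close to a line, the one-dimensional `reduced` list keeps most of the variation and the second dimension can be dropped with little loss.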

Let's look at the following figure again, which shows the best-fit line on two dimensions:

[Figure: best-fit line on two dimensions]

The projection line shows the best approximation of the original data with one dimension. If we take the points ...
