Machine Learning with Spark - Second Edition by Nick Pentreath, Manpreet Singh Ghotra, Rajdeep Dua

Singular value decomposition

SVD seeks to decompose a matrix X of dimension m x n into these three component matrices:

  • U of dimension m x m
  • S, a diagonal matrix of size m x n; the diagonal entries of S are referred to as the singular values
  • V^T (the transpose of V) of dimension n x n

X = U * S * V^T

Looking at the preceding formula, it appears that we have not reduced the dimensionality of the problem at all, since multiplying the three component matrices reconstructs the original matrix exactly. In practice, the truncated SVD is usually computed instead: only the top k singular values, which capture the most variation in the data, are kept, and the rest are discarded. The formula to reconstruct X from the component matrices is then approximate, and is given as follows:

X ~ Uk * ...
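To make the shapes and the effect of truncation concrete, here is a minimal sketch in plain Python. It is not the book's code: the 2 x 3 matrix and its factors are a hand-picked textbook example, and the helper names (matmul, truncate, frobenius_error) are hypothetical. The sketch assumes the singular values in S are already sorted in descending order and uses the standard rank-k form, which keeps the first k columns of U, the top-left k x k block of S, and the first k rows of V^T.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def truncate(U, S, Vt, k):
    """Rank-k reconstruction; assumes singular values are sorted descending."""
    Uk = [row[:k] for row in U]        # m x k
    Sk = [row[:k] for row in S[:k]]    # k x k
    Vtk = Vt[:k]                       # k x n
    return matmul(matmul(Uk, Sk), Vtk)


def frobenius_error(a, b):
    """Frobenius norm of the difference between two matrices."""
    return math.sqrt(sum((x - y) ** 2
                         for ra, rb in zip(a, b)
                         for x, y in zip(ra, rb)))


r = 1 / math.sqrt(2)

# Hand-picked example with m = 2 and n = 3 (an illustrative assumption).
X = [[3.0, 2.0, 2.0],
     [2.0, 3.0, -2.0]]

U = [[r, r],
     [r, -r]]                          # m x m
S = [[5.0, 0.0, 0.0],
     [0.0, 3.0, 0.0]]                  # m x n, singular values 5 and 3
Vt = [[r, r, 0.0],
      [r / 3, -r / 3, 4 * r / 3],
      [2 / 3, -2 / 3, -1 / 3]]         # n x n

X_full = matmul(matmul(U, S), Vt)      # full product U * S * V^T
X_1 = truncate(U, S, Vt, 1)            # keep only the largest singular value

print(frobenius_error(X, X_full))      # reconstruction error of the full SVD
print(X_1)                             # rank-1 approximation of X
print(frobenius_error(X, X_1))         # error introduced by truncation
```

In this example the full product reproduces X up to floating-point rounding, while the rank-1 approximation has every entry of its first two columns close to 2.5 and its third column close to 0. Its Frobenius error is approximately 3, which is the singular value that was discarded.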
