The idea behind LSA is factorizing Mdw so as to extract a set of latent variables (this means that we can assume their existence, but they cannot be observed directly) that work as connectors between the document and terms. As discussed in Chapter 11, Introducing Recommendation Systems, a very common decomposition method is Singular Value Decomposition (SVD):
However, we're not interested in a full decomposition; we are interested only in the subspace defined by the top k singular values:
This approximation has the ...