In this section, we look at several linear and non-linear techniques for topic modeling. The linear techniques are Latent Semantic Analysis (in two variants: Singular Value Decomposition and Non-negative Matrix Factorization), probabilistic Latent Semantic Analysis, and Latent Dirichlet Allocation. The non-linear techniques are LDA2Vec and the Neural Variational Document Model.
In Latent Semantic Analysis (LSA), topics are discovered by approximating the documents with a smaller number of topic vectors. A collection of documents is represented by a document-word matrix:
- In its simplest form, the document-word matrix consists of raw counts, which is the frequency ...
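
As a rough illustration of the raw-count form, the sketch below builds a small document-word matrix in plain Python. The helper name `build_count_matrix`, the whitespace tokenization, and the two sample sentences are assumptions made for this example; they are not taken from the text, and a real pipeline would normally use a proper tokenizer.

```python
from collections import Counter


def build_count_matrix(docs):
    """Return (vocabulary, matrix), where matrix[i][j] is the number of
    times vocabulary[j] occurs in docs[i]."""
    # Assumption: naive tokenization by lowercasing and splitting on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    # One column per distinct word, in sorted order so the layout is stable.
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Counter returns 0 for words that do not appear in this document.
        matrix.append([counts[word] for word in vocabulary])
    return vocabulary, matrix


# Hypothetical two-document corpus.
docs = ["the cat sat", "the dog sat on the mat"]
vocabulary, matrix = build_count_matrix(docs)

for row in matrix:
    print(row)
```

Each row corresponds to one document and each column to one vocabulary word, so the matrix has as many rows as documents and as many columns as distinct words.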