June 2018
Beginner to intermediate
306 pages
7h 42m
English
Let's start with the most popular topic modeling algorithm: latent Dirichlet allocation, or LDA, as we called it before. The LDA model was introduced in 2003 by Blei, Ng, and Jordan and is described in the paper Latent Dirichlet Allocation [3].
As we discussed before, LDA models each document in a corpus as a distribution over topics, and each topic as a distribution over words. What exactly is a distribution of words? It assigns every word in the vocabulary a probability under a given topic. Gensim makes these distributions easy to inspect and use.
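To make the idea concrete before turning to Gensim, here is a minimal pure-Python sketch. The topic names, words, and probabilities are all made up for illustration; a real model learns them from the corpus. It shows a topic as a distribution over words, a document as a mixture of topics, and how the two combine into the probability of seeing a word in that document.

```python
# Two hypothetical topics, each a probability distribution over words.
# The names and numbers are invented for illustration only.
topics = {
    "finance": {"bank": 0.4, "money": 0.3, "loan": 0.2, "river": 0.1},
    "nature": {"river": 0.4, "water": 0.3, "fish": 0.2, "bank": 0.1},
}

# A hypothetical document, described as a mixture over the topics above.
doc_topic_mix = {"finance": 0.75, "nature": 0.25}


def word_probability(word, topic_mix, topics):
    """P(word | document) = sum over topics of P(topic | document) * P(word | topic)."""
    return sum(
        weight * topics[name].get(word, 0.0)
        for name, weight in topic_mix.items()
    )


# Collect every word that appears in any topic.
vocabulary = sorted({word for dist in topics.values() for word in dist})

# The document's own word distribution, implied by its topic mixture.
doc_word_dist = {
    word: word_probability(word, doc_topic_mix, topics) for word in vocabulary
}

for word, prob in sorted(doc_word_dist.items(), key=lambda item: -item[1]):
    print(f"{word:>6}: {prob:.3f}")
```

Each topic's probabilities sum to 1, and so does the resulting document-level distribution. Note that "bank" has weight in both topics, which is how the same word can belong to more than one theme.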
Cells 15 and 16 of the Jupyter notebook let you see this.
ldamodel = LdaModel(corpus=corpus, num_topics=10, id2word=dictionary)
That's how easy it is to create a model: just specify the corpus, the dictionary mapping, and the number of topics we want to use ...