Gensim for topic modeling

We already used the Gensim library in Chapter 7, Automatic Text Summarization, to extract keywords and summaries from text. Here we will use it to build a topic model of a collection of texts. As in earlier chapters, we will practice with a few different types of document collections and see how the results vary.

First, we will build a small test program to make sure that Gensim is installed correctly and that its LDA implementation can generate a topic model from a collection of documents. If Gensim is not installed in your Anaconda distribution, run conda install gensim in your terminal.

We begin by importing the Gensim modules and a PrettyPrinter for formatting the output:

from gensim import corpora
from gensim.models.ldamodel ...
