We already used the Gensim library in Chapter 7, Automatic Text Summarization, to extract keywords and summaries from text. Here we will use it to build a topic model of a collection of texts. As in earlier chapters, we will practice with a few different types of document collections and see how the results vary.
First, we will build a small test program to make sure that Gensim and LDA are installed correctly and can generate a topic model from a collection of documents. If Gensim is not installed in your Anaconda environment, run the following command in your terminal:
conda install gensim
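Before going further, a quick smoke test can confirm that the installation works. The sketch below is not the test program developed in this chapter. It is a minimal illustration that assumes a made-up, pre-tokenized toy corpus (TOY_DOCS) and two hypothetical helper functions, gensim_installed and build_toy_topics. The parameter values (two topics, ten passes, a fixed random seed) are arbitrary choices for the example.

```python
import importlib.util
from pprint import PrettyPrinter

# Hypothetical toy corpus, already tokenized. It is invented for this
# sketch and is not one of the document collections used in the chapter.
TOY_DOCS = [
    ["cat", "dog", "pet", "vet"],
    ["dog", "pet", "leash", "walk"],
    ["stock", "market", "trade", "price"],
    ["market", "price", "bank", "trade"],
]


def gensim_installed():
    """Return True if the gensim package can be found on the import path."""
    return importlib.util.find_spec("gensim") is not None


def build_toy_topics(docs=TOY_DOCS, num_topics=2):
    """Fit a tiny LDA model and return its topics, or None if Gensim is missing."""
    if not gensim_installed():
        return None

    # Imported here so the script can still report a missing installation.
    from gensim import corpora
    from gensim.models.ldamodel import LdaModel

    # Map each distinct token to an integer id.
    dictionary = corpora.Dictionary(docs)
    # Convert each document to a bag-of-words list of (token_id, count) pairs.
    bow_corpus = [dictionary.doc2bow(doc) for doc in docs]

    lda = LdaModel(
        corpus=bow_corpus,
        id2word=dictionary,
        num_topics=num_topics,
        passes=10,        # assumed value; a few passes help on tiny corpora
        random_state=0,   # fixed seed so repeated runs are comparable
    )
    # Each entry is a (topic_id, "weight*word + ...") pair.
    return lda.print_topics(num_topics=num_topics, num_words=4)


if __name__ == "__main__":
    topics = build_toy_topics()
    if topics is None:
        print("Gensim not found; try: conda install gensim")
    else:
        PrettyPrinter(indent=2).pprint(topics)
```

If the script prints a list of two topics, Gensim and its LDA implementation are importable and working. With a corpus this small, the word weights themselves carry little meaning and should not be read as a real result.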
We begin by importing the Gensim libraries and a PrettyPrinter for formatting:
from gensim import corpora
from gensim.models.ldamodel ...