Mastering Data Mining with Python – Find patterns hidden in your data by Megan Squire

Gensim for topic modeling

We already used the Gensim library in Chapter 7, Automatic Text Summarization, to extract keywords and summaries from text. Here we will use it to build a topic model of a collection of texts. As in earlier chapters, we will practice with a few different types of document collections and see how the results vary.

First, we will build a small test program to make sure that Gensim and LDA are installed correctly and can generate a topic model from a collection of documents. If Gensim is not installed in your Anaconda distribution, run conda install gensim in your terminal.
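Before writing the test program, it can help to confirm that Python can find the package at all. The following is a minimal sketch using only the standard library; the helper name gensim_is_installed is hypothetical and not part of Gensim or of the book's code.

```python
import importlib.util


def gensim_is_installed():
    """Return True if the gensim package can be found on the import path.

    Hypothetical helper for illustration; it only locates the package
    and does not import it.
    """
    return importlib.util.find_spec("gensim") is not None


installed = gensim_is_installed()
if installed:
    print("Gensim was found on the import path.")
else:
    print("Gensim was not found; try running: conda install gensim")
```

This only checks that the package is discoverable. Whether it can actually build a model is what the test program itself verifies.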

We begin by importing the Gensim libraries and a PrettyPrinter for formatting:

from gensim import corpora
from gensim.models.ldamodel ...
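The listing above is cut off in this excerpt, so what follows is not the book's program. It is a minimal sketch of what such a smoke test might look like, assuming the LdaModel class from gensim.models.ldamodel. The toy documents, the function name build_toy_topic_model, and the parameter values (two topics, ten passes, a fixed random seed) are all assumptions chosen for illustration.

```python
from pprint import PrettyPrinter


def build_toy_topic_model(num_topics=2):
    """Train a tiny LDA model on made-up documents (illustrative only)."""
    # Imported inside the function so that a missing Gensim install
    # surfaces as an ImportError the caller can handle.
    from gensim import corpora
    from gensim.models.ldamodel import LdaModel

    # Hypothetical, pre-tokenized toy documents; not from the book.
    texts = [
        ["cat", "dog", "pet", "food"],
        ["dog", "walk", "park", "pet"],
        ["python", "code", "data", "model"],
        ["data", "mining", "python", "patterns"],
    ]

    # Map each unique token to an integer id.
    dictionary = corpora.Dictionary(texts)
    # Convert each document into a bag-of-words list of (id, count) pairs.
    corpus = [dictionary.doc2bow(text) for text in texts]

    # random_state is fixed so that repeated runs give comparable results.
    return LdaModel(
        corpus,
        num_topics=num_topics,
        id2word=dictionary,
        passes=10,
        random_state=0,
    )


try:
    model = build_toy_topic_model()
except ImportError:
    model = None
    print("Gensim could not be imported; try running: conda install gensim")
else:
    # Each entry is a (topic_id, "weight*word + ...") pair.
    PrettyPrinter(indent=2).pprint(model.print_topics(num_words=3))
```

If the script prints a list of two topics, each a weighted mix of words, then Gensim and its LDA implementation are working. With a corpus this small the topics themselves are not meaningful; the point is only that a model can be trained.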
