In this section we will build a topic analysis class based on the lda library. As usual, we first import all the needed libraries:
import sys
import numpy as np
import lda
import json
import pandas as pd
from collections import Counter, OrderedDict
import nltk
from nltk.corpus import stopwords
from itertools import *
from sklearn.feature_extraction.text import CountVectorizer
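Before moving on, it helps to picture the input that the lda library is fitted on: a document-term count matrix, which is the kind of matrix CountVectorizer builds. The snippet below is only a minimal standard-library sketch of that idea, not the code used in this section. The toy corpus and the names `docs`, `tokenized`, `vocab`, and `doc_term` are made up for illustration, and the whitespace tokenization is a simplifying assumption (CountVectorizer's default tokenizer behaves differently).

```python
from collections import Counter

# Hypothetical toy corpus: these verbatims are invented for illustration.
docs = [
    "the battery life is great",
    "battery drains fast",
    "great screen and great battery",
]

# Naive whitespace tokenization (an assumption; CountVectorizer uses its
# own token pattern and, by default, drops single-character tokens).
tokenized = [doc.lower().split() for doc in docs]

# Vocabulary: every distinct token, sorted so the column order is stable.
vocab = sorted({tok for toks in tokenized for tok in toks})

# Document-term matrix: one row per document, one column per vocabulary
# word, each cell holding how often that word occurs in that document.
counts = [Counter(toks) for toks in tokenized]
doc_term = [[c[word] for word in vocab] for c in counts]

for doc, row in zip(docs, doc_term):
    print(row, "<-", doc)
```

Each row is the bag-of-words representation of one verbatim. A topic model works from these counts rather than from the raw text, which is why a vectorizer and a vocabulary sit alongside the model.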
Then, we create a topic_analysis class with 5 class properties: dataframe, vocab, model, vectorizer, and topics:
class topic_analysis(object):
    """
    A class to extract topics and associate all the verbatims with a specific topic
    Input: dataframe, ...