7. Topic Modeling


In this chapter, you will apply basic cleaning techniques to textual data and then model the cleaned data to derive relevant topics. You will evaluate Latent Dirichlet Allocation (LDA) models and run non-negative matrix factorization (NMF) models. Finally, you will interpret the results of topic models and identify the best topic model for a given scenario. Along the way, you will see how topic modeling provides insight into the underlying structure of documents. By the end of this chapter, you will be able to build fully functioning topic models that derive value and insights for your business.
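To make the clean-then-model workflow concrete, here is a minimal sketch that uses only the Python standard library. It is not the chapter's own code: the toy corpus, the stop-word list, and every function name are assumptions invented for illustration, and NMF is implemented with the classic multiplicative update rules rather than with a library. LDA is not shown.

```python
import random
import re

# Hypothetical toy corpus and stop-word list, invented for illustration.
DOCS = [
    "The team won the football match with a late goal!",
    "A late goal decided the football match.",
    "The bank raised interest rates on loans.",
    "Interest rates on bank loans are rising.",
]
STOP_WORDS = {"the", "a", "with", "on", "are", "and", "of"}


def clean(text):
    """Lowercase, keep alphabetic tokens only, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def doc_term_matrix(docs):
    """Return (sorted vocabulary, count matrix with one row per document)."""
    tokens = [clean(d) for d in docs]
    vocab = sorted({t for doc in tokens for t in doc})
    index = {term: j for j, term in enumerate(vocab)}
    V = [[0.0] * len(vocab) for _ in docs]
    for i, doc in enumerate(tokens):
        for t in doc:
            V[i][index[t]] += 1.0
    return vocab, V


def matmul(A, B):
    """Plain list-of-lists matrix product."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frobenius_error(V, W, H):
    """Frobenius norm of V - WH, the reconstruction error."""
    WH = matmul(W, H)
    return sum((v - wh) ** 2 for rv, rwh in zip(V, WH) for v, wh in zip(rv, rwh)) ** 0.5


def nmf(V, k, n_iter=200, seed=0, eps=1e-9):
    """Factor V (docs x terms) into W (docs x k) and H (k x terms), all >= 0."""
    rng = random.Random(seed)
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in V]
    H = [[rng.random() + 0.1 for _ in V[0]] for _ in range(k)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * n / (d + eps) for h, n, d in zip(rh, rn, rd)]
             for rh, rn, rd in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * n / (d + eps) for w, n, d in zip(rw, rn, rd)]
             for rw, rn, rd in zip(W, num, den)]
    return W, H


def top_terms(H, vocab, n=3):
    """For each topic (row of H), list the n highest-weighted terms."""
    return [[vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:n]]
            for row in H]


vocab, V = doc_term_matrix(DOCS)
W, H = nmf(V, k=2)
for topic_id, terms in enumerate(top_terms(H, vocab)):
    print(f"Topic {topic_id}: {', '.join(terms)}")
```

Each row of `H` weights the vocabulary for one topic, and each row of `W` says how strongly a document expresses each topic. Topic numbering depends on the random initialization, so the labels carry no meaning on their own. On real corpora, a tested library implementation would normally replace the hand-written loops here.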


In the last chapter, the discussion focused on preparing data for modeling using dimensionality reduction and autoencoding. ...
