What are topic models?

We will now make our first foray into probabilistic models and machine learning with text. We did, of course, come across such models earlier on (in Chapter 5, POS-Tagging and Its Applications, Chapter 6, NER-Tagging and Its Applications, and Chapter 7, Dependency Parsing), especially in the way we trained our NER and POS taggers, but our goal in the previous chapters was not to come up with a statistical model involving our text data.

What is a topic model? As the name might suggest, it is a probabilistic model which contains information about topics in the text. We now must ask what exactly a topic is - we can understand a topic as a theme, or underlying ideas represented in text. For example, if we are working with ...

Get Natural Language Processing and Computational Linguistics now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.