Natural Language Processing and Computational Linguistics by Bhargav Srinivasa-Desikan

Advanced training tips

In Chapter 8, Topic Models, we explored what topic models are and how to set them up with both Gensim and scikit-learn. Setting up a topic model is not enough, though: a poorly trained model will not give us any useful information.

We've already talked about the most important pre-training tip: preprocessing. It should be clear by now that garbage in means garbage out, but sometimes we still get nonsensical output even after making sure the input is clean. In this section, we will briefly discuss what else you can do to polish your results.
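As a quick reminder of what the preprocessing step can look like, here is a minimal, library-free sketch. It is an illustration under stated assumptions, not the pipeline used in the earlier chapters: the stop word list, the sample documents, and the helper names `tokenize` and `filter_extremes` are all hypothetical. The `no_below` and `no_above` parameters are named after the arguments of Gensim's `Dictionary.filter_extremes`, which performs a similar document-frequency filter.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop word list; a real pipeline would use a fuller one.
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in", "on"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def filter_extremes(docs, no_below=2, no_above=0.5):
    """Drop tokens that appear in fewer than `no_below` documents or in more
    than a `no_above` fraction of all documents."""
    # Count each token once per document (document frequency).
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    max_docs = no_above * len(docs)
    keep = {tok for tok, df in doc_freq.items() if no_below <= df <= max_docs}
    return [[tok for tok in doc if tok in keep] for doc in docs]


# Made-up sample documents, for illustration only.
raw_docs = [
    "The bank approved the loan.",
    "The river bank flooded in spring.",
    "A loan is a debt to the bank.",
    "Spring rain flooded the valley.",
]

tokenized = [tokenize(doc) for doc in raw_docs]
cleaned = filter_extremes(tokenized, no_below=2, no_above=0.5)
```

In this toy corpus, "bank" occurs in three of the four documents and is dropped as too common, while words that occur only once are dropped as too rare. Both kinds of token tend to add noise rather than help separate topics.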

It would be wise to revisit Chapter 3, SpaCy's Language Model, and Chapter 4, Gensim - Vectorizing Text and Transformations and n-grams, now - ...