6

Latent Semantic Indexing with Gensim

In Chapter 4, Latent Semantic Indexing with scikit-learn, we learned how LSI is constructed from SVD and used scikit-learn to perform it. We also mentioned that the Gensim library implements LSI in a few lines of code suited to production use. In this chapter, we will build an LSI model with Gensim, learn how to determine the right number of topics, and put the model to real use as a search engine. This production-oriented perspective should help data scientists from non-NLP areas consider stepping into the NLP world.

This chapter covers the following topics:

  • Performing text preprocessing
  • Performing text representation with BoW and TF-IDF
  • Modeling ...
