January 2018
Intermediate to advanced
470 pages
11h 9m
English
Topic modeling (TM) is a technique widely used to mine text from large collections of documents by discovering the topics they contain. Each topic is described by a set of terms and their relative weights, and these topics can then be used to summarize and organize the documents. The dataset used for this project is plain, unstructured text.
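To make the idea of topic terms and relative weights concrete, here is a minimal sketch in Python. It does not run a topic model. The topics, terms, weights, and sample documents are all hypothetical and written by hand for illustration, and the helper names `score_document` and `assign_topic` are assumptions, not part of any library. In practice, an algorithm such as LDA would learn the topics from the corpus.

```python
# Hypothetical topics: each maps a term to its relative weight.
# A real topic model (for example LDA) would learn these from the corpus.
TOPICS = {
    "sports": {"game": 0.5, "team": 0.3, "score": 0.2},
    "finance": {"market": 0.5, "stock": 0.3, "price": 0.2},
}


def score_document(text, topic_terms):
    """Sum the weights of the topic terms that occur in the text."""
    tokens = text.lower().split()
    return sum(topic_terms.get(token, 0.0) for token in tokens)


def assign_topic(text, topics=TOPICS):
    """Return the highest-scoring topic name, or None if no term matches."""
    scores = {name: score_document(text, terms) for name, terms in topics.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Organize a few hypothetical documents by their best-matching topic.
documents = [
    "The team won the game",
    "Stock price fell as the market closed",
]
groups = {}
for doc in documents:
    groups.setdefault(assign_topic(doc), []).append(doc)
```

Here `groups` ends up mapping each topic name to the documents assigned to it, which is one simple way weighted topic terms can be used to organize a collection.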
We will see how effectively the Latent Dirichlet Allocation (LDA) algorithm finds useful patterns in the data. We will also compare LDA with other TM algorithms and examine how well it scales. In addition, we will use Natural Language Processing (NLP) libraries such as Stanford NLP.
In a nutshell, we will learn the following topics throughout this end-to-end project: ...