Machine Learning for Finance

by James Le, Jannes Klaas
May 2019
Intermediate to advanced
456 pages
11h 38m
English
Packt Publishing

Topic modeling

A final, very useful application of word counting is topic modeling. Given a set of texts, can we find clusters of topics? One method for doing this is called Latent Dirichlet Allocation (LDA).

Note: The code and data for this section can be found on Kaggle at https://www.kaggle.com/jannesklaas/topic-modeling-with-lda.

While the name is quite a mouthful, the algorithm is a very useful one, so we will look at it step by step. LDA makes the following assumptions about how texts are written:

  1. First, a topic distribution is chosen, say 70% machine learning and 30% finance.
  2. Second, the distribution of words for each topic is chosen. For example, the topic "machine learning" might be made up of 20% the word "tensor," 10% the word ...
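The generative assumption described above can be sketched in a few lines of Python. This is only a minimal illustration, not the book's code: the 70%/30% topic mix and the 20% weight for "tensor" come from the text, while every other word and probability is a hypothetical placeholder. It also assumes the standard LDA description in which each word is produced by first drawing a topic from the document's topic mix and then drawing a word from that topic. Full LDA draws the topic mix itself from a Dirichlet distribution; here it is simply fixed.

```python
import random
from collections import Counter

# Step 1: the document's topic mixture (the 70/30 split from the text).
topic_mix = {"machine learning": 0.7, "finance": 0.3}

# Step 2: one word distribution per topic.
# Apart from "tensor": 0.2, these words and weights are hypothetical.
word_dists = {
    "machine learning": {"tensor": 0.2, "gradient": 0.3, "model": 0.5},
    "finance": {"bond": 0.4, "yield": 0.35, "model": 0.25},
}


def sample(dist, rng):
    """Draw one key from a {key: probability} mapping."""
    keys = list(dist)
    return rng.choices(keys, weights=[dist[k] for k in keys], k=1)[0]


def generate_document(n_words, rng):
    """Return (topic, word) pairs: pick a topic, then a word from that topic."""
    doc = []
    for _ in range(n_words):
        topic = sample(topic_mix, rng)
        word = sample(word_dists[topic], rng)
        doc.append((topic, word))
    return doc


rng = random.Random(0)  # fixed seed so the run is repeatable
doc = generate_document(10_000, rng)

# With enough words, the share of each topic approaches the chosen mixture.
topic_counts = Counter(topic for topic, _ in doc)
ml_share = topic_counts["machine learning"] / len(doc)
print(f"share of words drawn from 'machine learning': {ml_share:.2f}")
```

Fitting LDA runs this story in reverse: given only the observed words, it estimates the topic mixtures and word distributions that most plausibly produced them. Note that a word such as "model" can belong to more than one topic, which is why word counts alone do not reveal the topics directly.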
ISBN: 9781789136364