Machine Learning for Algorithmic Trading - Second Edition

by Stefan Jansen
July 2020
Beginner to intermediate
820 pages
25h 30m
English
Packt Publishing

15

Topic Modeling – Summarizing Financial News

In the last chapter, we used the bag-of-words (BOW) model to convert unstructured text data into a numerical format. This model abstracts from word order and represents documents as word vectors, where each entry represents the relevance of a token to the document. The resulting document-term matrix (DTM), or term-document matrix when transposed, is useful for comparing documents to each other or to a query vector based on the similarity of their token content, and therefore for finding the proverbial needle in a haystack. It also provides informative features for classifying documents, as in our sentiment analysis examples.
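To make the document-term matrix concrete, the following minimal sketch builds one from three made-up headlines and ranks them against a query vector by cosine similarity. It is an illustration under stated assumptions, not the book's code: the headlines, the helper names `build_dtm`, `vectorize`, and `cosine`, whitespace tokenization, and raw term counts as the measure of token relevance are all hypothetical choices made to keep the example dependency-free.

```python
import math
from collections import Counter


def build_dtm(docs):
    """Return (vocab, rows), where rows[i][j] counts term vocab[j] in docs[i]."""
    # Assumption: lowercase + whitespace split stands in for a real tokenizer.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        rows.append([counts[term] for term in vocab])
    return vocab, rows


def vectorize(text, vocab):
    """Map a query onto the same vocabulary so it is comparable to DTM rows."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocab]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical headlines, invented for illustration only.
docs = [
    "central bank raises interest rates",
    "tech stocks rally as earnings beat forecasts",
    "bank earnings beat forecasts despite higher rates",
]

vocab, dtm = build_dtm(docs)
query = vectorize("interest rates", vocab)

# Score every document (one DTM row each) against the query vector.
scores = [cosine(row, query) for row in dtm]
best = max(range(len(docs)), key=scores.__getitem__)

for doc, score in zip(docs, scores):
    print(f"{score:.3f}  {doc}")
print("best match:", docs[best])
```

Because word order is discarded, only shared tokens contribute to the score: the second headline shares no token with the query and scores zero, while the first shares both query tokens and ranks highest. In practice a weighting scheme such as TF-IDF would typically replace the raw counts used here.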

However, this document model produces data that is both high-dimensional and very sparse, ...



Publisher Resources

ISBN: 9781839217715