Machine Learning for Finance

by James Le, Jannes Klaas
May 2019
Intermediate to advanced
456 pages
11h 38m
English
Packt Publishing

Bag-of-words

A simple yet effective way of classifying text is to treat the text as a bag of words. This means that we ignore the order in which words appear and consider only which words are present.
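To make the idea of ignoring word order concrete, here is a minimal sketch in plain Python. The helper name `bag_of_words` and the whitespace-based tokenization are assumptions made for illustration; they are not taken from the book.

```python
def bag_of_words(text):
    """Return the set of lowercase words in `text`, discarding their order."""
    # Splitting on whitespace is a simplifying assumption; real tokenizers
    # also handle punctuation and other edge cases.
    return set(text.lower().split())


a = bag_of_words("dogs chase cats")
b = bag_of_words("cats chase dogs")

# Both texts contain the same words, so their bags are identical
# even though the sentences mean different things.
print(a == b)
```

The two sentences are indistinguishable under this representation, which is exactly the information a bag-of-words model gives up.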

One way of doing a bag-of-words classification is to simply count how often each word occurs in a text. This is done with a so-called count vector. Each word is assigned an index, and for each text, the value of the count vector at that index is the number of times the corresponding word occurs in the text.

As an example, the count vector for the text "I see cats and dogs and elephants" could look like this:

i    see    cats    and    dogs    elephants
1    1      1       2      1       1
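The count vector above can be reproduced with a short sketch. The function names `build_vocabulary` and `count_vector` are hypothetical, and the choices to lowercase the text, split on whitespace, and assign indices in order of first appearance are assumptions made for this illustration rather than the book's implementation.

```python
from collections import Counter


def build_vocabulary(texts):
    """Assign each distinct word an index, in order of first appearance."""
    vocabulary = {}
    for text in texts:
        for word in text.lower().split():
            if word not in vocabulary:
                vocabulary[word] = len(vocabulary)
    return vocabulary


def count_vector(text, vocabulary):
    """Return a list whose entry at each word's index is that word's count."""
    counts = Counter(text.lower().split())
    vector = [0] * len(vocabulary)
    for word, index in vocabulary.items():
        vector[index] = counts[word]  # Counter returns 0 for unseen words
    return vector


text = "I see cats and dogs and elephants"
vocabulary = build_vocabulary([text])

print(vocabulary)
print(count_vector(text, vocabulary))  # [1, 1, 1, 2, 1, 1]
```

Words that are not in the vocabulary are simply ignored by `count_vector`, so a new text is always mapped to a vector of the same length.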

In reality, count vectors ...

ISBN: 9781789136364