Hands-On Machine Learning for Algorithmic Trading

by Stefan Jansen

December 2018

Beginner to intermediate

684 pages

21h 9m

English

Packt Publishing

Content preview from Hands-On Machine Learning for Algorithmic Trading

Using CountVectorizer

The notebook contains an interactive visualization that explores how the min_df and max_df settings affect the size of the vocabulary. We read the articles into a DataFrame, configure CountVectorizer to produce binary flags (binary=True) and keep all tokens (min_df=1, max_df=1.0), and call its .fit_transform() method to produce a document-term matrix:

binary_vectorizer = CountVectorizer(max_df=1.0,
                                    min_df=1,
                                    binary=True)
binary_dtm = binary_vectorizer.fit_transform(docs.body)

<2225x29275 sparse matrix of type '<class 'numpy.int64'>'
   with 445870 stored elements in Compressed Sparse Row format>

The output is a scipy.sparse matrix in row format that efficiently stores the small share (<0.7%) of 445870 non-zero entries in the 2225 (document) rows and
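To make the sparsity figure and the role of min_df and max_df concrete, the following is a minimal sketch on a toy corpus, not the notebook's code. The four sentences in toy_docs and the helper fit_binary_dtm are hypothetical and stand in for docs.body; only the matrix dimensions and the count of stored elements in the final calculation come from the output shown above.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy corpus standing in for the article bodies (docs.body).
toy_docs = [
    "stocks rally as markets rise",
    "markets fall as stocks slide",
    "central bank holds rates",
    "bank shares rise as rates hold",
]


def fit_binary_dtm(texts, min_df=1, max_df=1.0):
    """Fit a binary CountVectorizer; return (dtm, vocabulary size, density)."""
    vectorizer = CountVectorizer(min_df=min_df, max_df=max_df, binary=True)
    dtm = vectorizer.fit_transform(texts)  # scipy.sparse matrix in CSR format
    n_rows, n_cols = dtm.shape
    # Share of cells that hold a stored (non-zero) entry.
    density = dtm.nnz / (n_rows * n_cols)
    return dtm, n_cols, density


# Keep every token, as in the settings used above.
dtm_all, vocab_all, density_all = fit_binary_dtm(toy_docs)

# Drop tokens that appear in fewer than two documents.
dtm_min2, vocab_min2, _ = fit_binary_dtm(toy_docs, min_df=2)

# Drop tokens that appear in more than half of the documents.
dtm_max, vocab_max, _ = fit_binary_dtm(toy_docs, max_df=0.5)

print(f"all tokens: {vocab_all} terms, density {density_all:.2%}")
print(f"min_df=2:   {vocab_min2} terms")
print(f"max_df=0.5: {vocab_max} terms")

# The same density calculation applied to the figures reported above:
# 445870 stored elements in a 2225 x 29275 matrix.
book_density = 445870 / (2225 * 29275)
print(f"reported matrix density: {book_density:.2%}")
```

Raising min_df removes rare tokens and lowering max_df removes very common ones, so both shrink the vocabulary. With binary=True every stored entry equals 1, so the matrix records only whether a token occurs in a document, not how often.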

ISBN: 9781789346411