O'Reilly Live Online Training

Introduction to Machine Learning for Algorithmic Trading

Train algorithms to discover trading signals using Python

Topic: Business
Deepak Kanungo

In the 20th century, traders and system developers worked together to explicitly formulate all the rules that were executed by their algorithmic trading systems. In the 21st century, financial data scientists are training computer algorithms to discover complex functional relationships from multiple data sources to augment the insights of traders. These ML models are now generating many of the rules used in all aspects of the trading process, from idea generation to execution and portfolio management. ML-based algorithmic trading has contributed significantly to the frenetic pace of automation in the investment management industry, where over 75% of the daily trading in equities is done algorithmically.

Linear models play a pivotal role in modern financial research and practice. These types of models have the longest history in the industry and are seen as the baseline financial model for making inferences and predictions. Furthermore, linear models are intuitive and transparent.

Join expert Deepak Kanungo to dive into supervised linear ML models for regression and classification as you learn the fundamental concepts, processes, and technological tools for applying machine learning models to algorithmic trading strategies.

Note that live trading is out of scope for the course.

What you'll learn and how you can apply it

By the end of this live online course, you’ll understand:

  • The benefits and challenges of applying machine learning models to algorithmic trading and investing
  • The various types of machine learning models used in algorithmic trading
  • The concepts, processes, and tools used for researching, designing, and developing ML models
  • How to manage the trade-off between bias and variance in ML models
  • The pitfalls of cross-validation and backtesting when evaluating the performance of ML models
  • The paramount importance of domain expertise and feature engineering

And you’ll be able to:

  • Use scikit-learn to analyze, design, and develop linear ML models for regression and classification
  • Leverage the statsmodels library to diagnose the robustness of your ML models
  • Train and test linear ML models for algorithmic trading
  • Evaluate the performance of ML models
  • Fine-tune the hyperparameters of your ML models to improve their performance

This training course is for you because...

  • You’re a retail equity investor, financial analyst, or trader who wants to use machine learning models to discover new trading signals.

Prerequisites

Recommended preparation:

Recommended follow-up:

About your instructor

  • Deepak Kanungo is the founder and CEO of Hedged Capital LLC, an AI-powered trading and advisory firm that uses probabilistic models and technologies. In 2005, Deepak invented a project portfolio management system using Bayesian inference, the foundation of all probabilistic programming languages. Previously, Deepak was a financial advisor at Morgan Stanley, a Silicon Valley fintech entrepreneur, and a director in the Global Planning Department at Mastercard International. He was educated at Princeton University (astrophysics) and the London School of Economics (finance and information systems).

Schedule

The timeframes are only estimates and may vary according to how the class is progressing.

Types of ML models and the development process (55 minutes)

  • Group discussion: Your experience with trading and Python; ML concepts, benefits, and issues
  • Presentation: The different types of machine learning models used in finance, including supervised, unsupervised, deep learning, and reinforcement learning; the need for domain knowledge to curate data sources; the paramount importance of feature engineering; the trade-off between bias and variance in ML models; the ML development process for algorithmic trading
  • Hands-on exercises: Set up your Colab notebook; create pandas DataFrames to concatenate data from freely available public sources such as FRED (economic), Yahoo (equity), and Quandl (various)
  • Q&A
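
As a rough illustration of the DataFrame exercise above, the sketch below combines a daily equity series with a monthly economic series. It is not the course notebook: both series are synthetic stand-ins (no data is downloaded from FRED, Yahoo, or Quandl), and the column names `close` and `unemployment_rate` are assumptions made for the example.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical daily equity prices (stand-in for a Yahoo download).
days = pd.bdate_range("2023-01-02", periods=60)
equity = pd.DataFrame(
    {"close": 100 + rng.normal(0, 1, len(days)).cumsum()}, index=days
)

# Hypothetical monthly economic series (stand-in for a FRED download).
months = pd.date_range("2023-01-01", periods=3, freq="MS")
macro = pd.DataFrame({"unemployment_rate": [3.4, 3.6, 3.5]}, index=months)

# Concatenate column-wise on the union of dates, carry the latest monthly
# value forward, then keep only the days on which the equity traded.
combined = pd.concat([equity, macro], axis=1).sort_index()
combined["unemployment_rate"] = combined["unemployment_rate"].ffill()
combined = combined.loc[equity.index]

# Simple daily return, a typical prediction target; the first row is NaN.
combined["return"] = combined["close"].pct_change()
print(combined.head())
```

With real data, economic series are published with a lag, so aligning on the reference date rather than the release date can leak future information into the features.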

Break (5 minutes)

Using linear regression models to forecast stock price returns (55 minutes)

  • Presentation: Training an ordinary least squares linear regression model; using lasso and ridge regression to prevent overfitting to noisy financial data
  • Hands-on exercise: Use scikit-learn to train and test three types of linear regression models to predict stock price returns
  • Q&A
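
The three regression models named above can be compared with a few lines of scikit-learn. This is a minimal sketch under stated assumptions, not the course exercise: the features and returns are synthetic, only two of the ten features carry signal, and the `alpha` values are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(42)

# Synthetic stand-in for a feature matrix and next-period returns:
# only the first two of ten features carry any signal.
n_samples, n_features = 500, 10
X = rng.normal(size=(n_samples, n_features))
true_coef = np.zeros(n_features)
true_coef[:2] = [0.5, -0.3]
y = X @ true_coef + rng.normal(scale=1.0, size=n_samples)

# Chronological split: train on the first 80%, test on the rest (no shuffling).
split = int(n_samples * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

models = {
    "ols": LinearRegression(),
    "ridge": Ridge(alpha=10.0),   # L2 penalty shrinks all coefficients
    "lasso": Lasso(alpha=0.1),    # L1 penalty can set coefficients to zero
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = {
        "test_mse": mean_squared_error(y_test, model.predict(X_test)),
        "n_nonzero": int(np.sum(model.coef_ != 0)),
        "coef_norm": float(np.linalg.norm(model.coef_)),
    }

for name, stats in results.items():
    print(name, stats)
```

Ridge shrinks the coefficient vector toward zero without eliminating features, while lasso can drop features entirely, which is one reason both are used as guards against fitting noise.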

Break (5 minutes)

Using linear classification models to predict an economic recession (55 minutes)

  • Presentation: Training a logistic regression model; using lasso and ridge regularization to prevent overfitting to noisy financial data; how the classifier assigns probabilities
  • Hands-on exercise: Use scikit-learn to train and test logistic regression models to predict an economic recession
  • Q&A
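
To make the classification session concrete, here is a hedged sketch of regularized logistic regression in scikit-learn. The data is synthetic: the `spread` feature is a hypothetical yield-curve spread, the recession labels are simulated, and `C=0.1` is an arbitrary regularization strength chosen for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(7)
n = 400

# Hypothetical features: one informative spread plus four pure-noise columns.
spread = rng.normal(1.0, 1.0, n)
X = np.column_stack([spread, rng.normal(size=(n, 4))])

# Simulated label: a recession (1) is more likely when the spread is low.
p_true = 1.0 / (1.0 + np.exp(2.0 * spread))
y = (rng.random(n) < p_true).astype(int)

# Chronological split, as with any time-ordered data.
X_train, X_test, y_train, y_test = X[:300], X[300:], y[:300], y[300:]

# Smaller C means stronger regularization in scikit-learn.
models = {
    "ridge_l2": LogisticRegression(penalty="l2", C=0.1),
    "lasso_l1": LogisticRegression(penalty="l1", C=0.1, solver="liblinear"),
}

probs, coefs = {}, {}
for name, clf in models.items():
    pipe = make_pipeline(StandardScaler(), clf)
    pipe.fit(X_train, y_train)
    # Column 1 of predict_proba is the estimated probability of class 1.
    probs[name] = pipe.predict_proba(X_test)[:, 1]
    coefs[name] = pipe.named_steps["logisticregression"].coef_.ravel()
    print(name, "test accuracy:", pipe.score(X_test, y_test))
```

The classifier outputs a probability for each observation by passing a linear combination of the features through the logistic function; a class label only appears once a threshold (0.5 by default in `predict`) is applied.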

Break (5 minutes)

Evaluating and improving linear regression and classification ML models (60 minutes)

  • Presentation: Using cross-validation and grid search to improve performance; issues with using cross-validation techniques with financial data; risk-adjusted business, binary classification, and regression performance metrics; backtesting and forward testing algorithms using market data; pitfalls and how to remedy them
  • Hands-on exercise: Use scikit-learn and statsmodels to evaluate and diagnose both types of linear models and fine-tune them to improve their performance
  • Q&A
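
One common remedy for the cross-validation issue mentioned above is to use splits that respect time order. The sketch below pairs scikit-learn's `GridSearchCV` with `TimeSeriesSplit` to tune a ridge penalty; the data is synthetic and the `alpha` grid is an assumption, so treat it as an illustration rather than the course's method.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

rng = np.random.default_rng(1)

# Synthetic, time-ordered features and returns (rows are in date order).
X = rng.normal(size=(300, 5))
y = 0.4 * X[:, 0] + rng.normal(scale=1.0, size=300)

# Each fold trains on earlier observations and validates on later ones,
# unlike shuffled k-fold, which would mix future rows into the training set.
cv = TimeSeriesSplit(n_splits=5)

alphas = [0.01, 0.1, 1.0, 10.0, 100.0]
grid = GridSearchCV(
    Ridge(),
    param_grid={"alpha": alphas},
    cv=cv,
    scoring="neg_mean_squared_error",
)
grid.fit(X, y)

best_alpha = grid.best_params_["alpha"]
print("best alpha:", best_alpha)
print("mean validation MSE:", -grid.best_score_)

# Collect the folds to confirm that training rows always precede test rows.
folds = [(train_idx, test_idx) for train_idx, test_idx in cv.split(X)]
```

Time-ordered splits reduce look-ahead leakage but do not remove every pitfall: repeatedly tuning against the same history can still overfit the backtest.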