Live online training

Nonlinear Machine Learning for Algorithmic Trading

Train nonlinear algorithms to discover trading signals using Python

Topic: Data
Deepak Kanungo

Because financial markets are nonlinear in nature, supervised nonlinear machine learning (ML) models for regression and classification offer a useful approach to algorithmic trading. Decision tree (DT)-based learning algorithms are among the most powerful supervised learning methods available, and DT-based ensemble learning algorithms, such as random forests (RF) and gradient boosting machines (GBM), frequently win public ML competitions. A single DT model is easy to understand and visualize but tends to overfit noisy financial data. RF and GBM algorithms reduce this overfitting by combining many trees, at the cost of being harder to interpret.

Join expert Deepak Kanungo to explore the fundamental concepts, process, and technological tools for applying nonlinear machine learning models to algorithmic trading strategies. You’ll get hands-on with tools like pandas and scikit-learn as you learn how to concatenate data and train and test DT, RF, and GBM models to forecast market direction, predict a recession, and more.

Note that live trading is out of scope for this course.

What you'll learn and how you can apply it

By the end of this live online course, you’ll understand:

  • The benefits and challenges of applying DT-based machine learning models to algorithmic trading and investing
  • The pros and cons of different DT-based ML models used in algorithmic trading
  • The concepts, processes, and tools used for researching, designing, and developing models
  • The differences between Gini impurity, cross-entropy, and classification error rates
  • How to manage the trade-off between bias and variance in DT-based ML models
  • Bagging methods for reducing variance and out-of-bag estimates for evaluating performance
  • The different types of boosting algorithms for improving performance
  • The paramount importance of financial expertise and feature engineering
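
As a rough illustration of the three node impurity measures named in the list above, the sketch below computes each one from the class proportions in a single tree node. The function names and the 70/30 example node are hypothetical, and cross-entropy is reported in bits (log base 2), which is an assumption about units rather than something the course specifies.

```python
import math


def gini_impurity(proportions):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    return 1.0 - sum(p * p for p in proportions)


def cross_entropy(proportions):
    """Cross-entropy in bits: -sum(p * log2(p)), skipping empty classes."""
    return -sum(p * math.log2(p) for p in proportions if p > 0)


def classification_error(proportions):
    """Classification error: 1 minus the largest class proportion."""
    return 1.0 - max(proportions)


# Hypothetical node holding 70% "up" days and 30% "down" days
node = [0.7, 0.3]
measures = {
    "gini": gini_impurity(node),
    "cross_entropy": cross_entropy(node),
    "classification_error": classification_error(node),
}
print(measures)
```

All three measures are zero for a pure node and largest for an evenly mixed one. Gini impurity and cross-entropy are smooth functions of the proportions, which makes them more sensitive to changes in node purity than the classification error.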

And you’ll be able to:

  • Use scikit-learn to analyze, design, and develop DT-based nonlinear ML models for regression and classification
  • Visualize the workings of a DT model
  • Use bagging methods to train RF models and boosting methods to train GBM models
  • Evaluate model performance using out-of-bag testing
  • Tune the hyperparameters of DT, RF, and boosting algorithms to improve performance
  • Visualize how much each feature contributes to your models’ predictions
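
Out-of-bag testing and feature importance from the list above can be sketched with scikit-learn's `RandomForestClassifier`. This is a minimal example on synthetic data, not the course's own exercise: the three features, the labeling rule, and the hyperparameter values are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic data (assumption): three features, of which only the
# first two influence the label; the third is pure noise.
X = rng.normal(size=(500, 3))
y = (X[:, 0] ** 2 + X[:, 1] > 1).astype(int)

# Each tree is trained on a bootstrap sample (bagging). With
# oob_score=True, every row is scored only by the trees that did
# not see it during training.
forest = RandomForestClassifier(
    n_estimators=200,
    bootstrap=True,
    oob_score=True,
    random_state=0,
)
forest.fit(X, y)

oob_accuracy = forest.oob_score_           # accuracy on out-of-bag rows
importances = forest.feature_importances_  # impurity-based, sums to 1

print(f"Out-of-bag accuracy: {oob_accuracy:.3f}")
for name, value in zip(["feature_0", "feature_1", "noise"], importances):
    print(f"{name}: {value:.3f}")
```

The out-of-bag score gives a performance estimate without a separate holdout set. For time-ordered market data it still mixes past and future rows, so it is best read alongside a chronological test split rather than in place of one.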

This training course is for you because...

  • You’re a retail equity investor, financial analyst, or trader who wants to use machine learning models to discover new trading signals.

Prerequisites

  • Basic experience trading and investing in equities
  • Familiarity with Python and pandas data frames

About your instructor

  • Deepak Kanungo is the founder and CEO of Hedged Capital LLC, an AI-powered trading and advisory firm that uses probabilistic models and technologies. In 2005, Deepak invented a project portfolio management system using Bayesian inference, the foundation of all probabilistic programming languages. Previously, Deepak was a financial advisor at Morgan Stanley, a Silicon Valley fintech entrepreneur, and a director in the Global Planning Department at Mastercard International. He was educated at Princeton University (astrophysics) and the London School of Economics (finance and information systems).

Schedule

The timeframes are only estimates and may vary according to how the class is progressing.

Overview of various types of ML models and the development process (55 minutes)

  • Group discussion: Your trading and Python experience; DT-based ML concepts, benefits, and issues
  • Presentation: Overview of DT, RF, and GBM algorithms and how they’re used in finance; overview of the DT-based ML development process for algorithmic trading
  • Hands-on exercise: Create pandas DataFrames to concatenate data from freely available public sources such as FRED (economic), Yahoo (equity), Quandl (various), and Alpha Vantage
  • Q&A
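
The data concatenation step in the hands-on exercise above might look roughly like the following. To keep the sketch self-contained it builds two small made-up DataFrames instead of downloading anything, so the column names, dates, and values are hypothetical stand-ins for an equity price series and a less frequently updated economic series.

```python
import pandas as pd

# Hypothetical daily equity closes (stand-in for a downloaded price series)
daily_index = pd.date_range("2020-01-01", periods=6, freq="D")
equity = pd.DataFrame(
    {"close": [100.0, 101.5, 99.8, 102.2, 103.0, 102.5]},
    index=daily_index,
)

# Hypothetical economic series that is observed less often
econ_index = pd.to_datetime(["2020-01-01", "2020-01-04"])
economic = pd.DataFrame({"unemployment_rate": [3.5, 3.6]}, index=econ_index)

# Place the two sources side by side, aligned on the date index
combined = pd.concat([equity, economic], axis=1)

# Carry the last known economic reading forward to the following days
combined["unemployment_rate"] = combined["unemployment_rate"].ffill()

# Daily percentage change of the close; the first row has no prior value
combined["daily_return"] = combined["close"].pct_change()

print(combined)
```

Forward-filling uses only values already observed on each date. Real economic series are often published with a delay, so in practice the index should reflect the release date rather than the reference period to avoid look-ahead bias.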

Break (5 minutes)

Using DT classification and regression models to forecast market direction and returns (55 minutes)

  • Presentation: Building, training, and visualizing DT classification models; how the classification metrics are used
  • Hands-on exercise: Use scikit-learn to train and test DT classification models to predict market direction
  • Q&A
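
A DT classification exercise of the kind described above can be sketched as follows. The returns are randomly generated, so the example shows the mechanics only and has no real predictive signal. The lagged-return features, the depth limit, and the 75/25 chronological split are assumptions, not the course's settings.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(42)

# Synthetic daily returns (assumption): pure noise
returns = rng.normal(0.0, 0.01, size=400)

# Features: the previous two days' returns
# Label: 1 if the current day's return is positive, else 0
X = np.column_stack([returns[1:-1], returns[:-2]])
y = (returns[2:] > 0).astype(int)

# shuffle=False keeps time order, so the test rows follow the training rows
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, shuffle=False
)

# A shallow tree is easier to read and less prone to fitting noise
tree = DecisionTreeClassifier(max_depth=3, criterion="gini", random_state=42)
tree.fit(X_train, y_train)

y_pred = tree.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)

# Text rendering of the learned splits
rules = export_text(tree, feature_names=["lag_1", "lag_2"])
print(rules)
print(f"Test accuracy: {accuracy:.3f}")
```

On noise like this, test accuracy should hover near 50%, which is a useful baseline to keep in mind when the same code is pointed at real market data.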

Break (5 minutes)

Using RF classification and regression models to predict an economic recession and percentage loss in GDP (55 minutes)

  • Presentation: Building, training, and visualizing RF models; how the classifier assigns probabilities and issues with imbalanced classes
  • Hands-on exercise: Use scikit-learn to train and test the RF classification and regression models to predict an economic recession and percentage loss in GDP
  • Q&A
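
The points above about class probabilities and imbalanced classes can be illustrated with a small sketch. The data is synthetic, with a rare positive class standing in for recession periods. The two indicators, the labeling threshold, and the choice of `class_weight="balanced"` are assumptions, and reweighting is only one of several ways to handle imbalance.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)

# Two hypothetical economic indicators, synthetic values
X = rng.normal(size=(600, 2))

# Rare positive class (assumption): "recession" only when the first
# indicator is unusually low, roughly one row in ten
y = (X[:, 0] < -1.3).astype(int)

# class_weight="balanced" weights each class inversely to its frequency
# so the rare class is not swamped by the common one
model = RandomForestClassifier(
    n_estimators=200,
    class_weight="balanced",
    random_state=1,
)
model.fit(X, y)

# Each row holds [P(class 0), P(class 1)], averaged over the trees
proba = model.predict_proba(X[:5])
recession_probability = model.predict_proba(X)[:, 1]

print(f"Share of recession rows: {y.mean():.3f}")
print(proba)
```

With imbalanced classes, plain accuracy can look high for a model that never predicts the rare class, so the predicted probabilities are usually judged with metrics such as precision and recall.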

Break (5 minutes)

Evaluating and improving DT and RF models using GBM (50 minutes)

  • Presentation: Building, training, and testing GBM models; different types of boosting algorithms and learning rates to improve performance and reduce overfitting of noisy financial data
  • Hands-on exercise: Use scikit-learn to train and test GBM models and improve on the above DT and RF models
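
The effect of the learning rate mentioned above can be explored with a sketch like the one below, which fits scikit-learn's `GradientBoostingClassifier` at three rates on synthetic, noisy data. The data-generating rule, the candidate rates, and the other hyperparameter values are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)

# Synthetic noisy data (assumption): a nonlinear rule plus random noise
X = rng.normal(size=(600, 4))
noise = rng.normal(scale=0.5, size=600)
y = (X[:, 0] - X[:, 1] ** 2 + noise > 0).astype(int)

# Chronological-style split: no shuffling
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, shuffle=False
)

scores = {}
for rate in (1.0, 0.1, 0.01):
    gbm = GradientBoostingClassifier(
        n_estimators=200,
        learning_rate=rate,  # shrinks each tree's contribution
        max_depth=2,         # shallow trees as weak learners
        subsample=0.8,       # stochastic gradient boosting
        random_state=7,
    )
    gbm.fit(X_train, y_train)
    scores[rate] = gbm.score(X_test, y_test)  # test-set accuracy

for rate, score in scores.items():
    print(f"learning_rate={rate}: test accuracy {score:.3f}")
```

A smaller learning rate generally needs more trees to reach the same fit, so the two hyperparameters are usually tuned together.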

Wrap-up and Q&A (10 minutes)