Live online training

Inside unsupervised learning: Generative models and recommender systems

Explore generative models and build movie recommender systems

Ankur Patel

Many industry experts consider unsupervised learning the next frontier in artificial intelligence, one that may hold the key to artificial general intelligence. Conventional supervised learning cannot be applied to unlabeled data, which makes up the majority of the world's data. In these cases, unsupervised learning can help discover meaningful patterns buried deep in unlabeled datasets, patterns that would otherwise be nearly impossible for humans to uncover.

Join Ankur Patel for a deep dive into generative models, one of the core concepts of unsupervised learning. Machine learning models come in two broad forms: discriminative and generative. Discriminative models learn to map inputs to outputs, such as class labels, whereas generative models learn the probability distribution of the data itself. In other words, generative models identify the underlying structure in the input data and use it to learn a probability distribution that can represent and reproduce the original input data. Generative models (for example, generative adversarial networks) are a powerful means of generating synthetic data.
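
As a rough illustration of what "learning a probability distribution and reproducing the data" can mean, the sketch below fits about the simplest generative model there is: one independent Bernoulli probability per feature. It is not taken from the course materials. The data, the `true_rates` values, and the variable names are all made up for the example, and real generative models such as RBMs also capture dependencies between features, which this one does not.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical unlabeled data: 1,000 binary rows with 4 features
# (think of "liked this movie" flags). The rates below are invented for the demo.
true_rates = np.array([0.9, 0.5, 0.2, 0.7])
data = (rng.random((1000, 4)) < true_rates).astype(int)

# "Training": estimate one Bernoulli probability per feature from the data.
learned_rates = data.mean(axis=0)

# "Generation": draw new synthetic rows from the learned distribution.
synthetic = (rng.random((1000, 4)) < learned_rates).astype(int)

print("learned rates:  ", learned_rates.round(2))
print("synthetic rates:", synthetic.mean(axis=0).round(2))
```

The synthetic rows are new samples, not copies of the originals, yet their per-feature statistics should track those of the training data.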

In just 90 minutes, you'll learn how to train a simple type of generative model—a restricted Boltzmann machine—and use it to build a movie recommender system. These types of content recommender systems are widely used in industry today (e.g., shopping recommendations on Amazon, music recommendations on Spotify, movie recommendations on Netflix, video recommendations on YouTube, news recommendations on Facebook, and picture recommendations on Instagram).

What you'll learn and how you can apply it

By the end of this live online course, you'll understand:

  • The difference between discriminative and generative models
  • Why generative models are so powerful
  • The various types of generative models used today

And you'll be able to:

  • Train a type of generative model (a restricted Boltzmann machine, or RBM)
  • Build a recommender system using RBMs

This training course is for you because...

  • You're a data scientist or engineer who wants to work with unlabeled data.
  • You want to train generative models to solve a business use case.

Prerequisites

  • A working knowledge of Python
  • A basic understanding of machine learning

About your instructor

  • Ankur A. Patel is the Vice President of Data Science at 7Park Data, a Vista Equity Partners portfolio company. At 7Park Data, Ankur and his data science team use alternative data to build data products for hedge funds and corporations and develop machine learning as a service (MLaaS) for enterprise clients. MLaaS includes natural language processing (NLP), anomaly detection, clustering, and time series prediction. Prior to 7Park Data, Ankur led data science efforts in New York City for Israeli artificial intelligence firm ThetaRay, one of the world's pioneers in applied unsupervised learning.

    Ankur began his career as an analyst at J.P. Morgan and then became the lead emerging markets sovereign credit trader for Bridgewater Associates, the world's largest global macro hedge fund. He later founded and managed R-Squared Macro, a machine learning-based hedge fund, for five years. A graduate of the Woodrow Wilson School at Princeton University, Ankur is the recipient of the Lieutenant John A. Larkin Memorial Prize.

    He currently resides in Tribeca in New York City but travels extensively internationally.

Schedule

The timeframes are only estimates and may vary depending on how the class is progressing.

Introduction to unsupervised learning (10 minutes)

  • Lecture and hands-on exercises: How unsupervised learning fits into the machine learning ecosystem; common problems in machine learning, such as finding patterns in unlabeled data and generating new data by learning from unlabeled original data

Motivation for generative models (10 minutes)

  • Lecture and hands-on exercises: Why generative models are so powerful; how learning probability distributions helps generate synthetic data; why synthetic data is vital to advancing the field of machine learning

Motivation for recommender systems (10 minutes)

  • Lecture and hands-on exercises: How recommender systems are built; the difference between content-based recommenders and collaborative filtering; why recommender systems are so widely used in industry today
  • Q&A (5 minutes)
  • Break (5 minutes)
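
To make the distinction between content-based recommenders and collaborative filtering from this segment more concrete, here is a small sketch that scores movie similarity both ways. It is only an illustration. The movie names, genre flags, and ratings are invented, and treating an unrated movie as a 0 in the cosine calculation is a simplification chosen to keep the example short.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Content-based: describe each movie by its own attributes.
# Hypothetical genre flags in the order (action, comedy, romance).
genres = {
    "Movie A": np.array([1.0, 0.0, 0.0]),
    "Movie B": np.array([1.0, 1.0, 0.0]),
    "Movie C": np.array([0.0, 0.0, 1.0]),
}
content_ab = cosine_similarity(genres["Movie A"], genres["Movie B"])
content_ac = cosine_similarity(genres["Movie A"], genres["Movie C"])

# Collaborative filtering: describe each movie by how users rated it.
# Made-up ratings; rows are users, columns are Movies A, B, C; 0 = not rated.
ratings = np.array([
    [5.0, 4.0, 0.0],
    [4.0, 5.0, 1.0],
    [1.0, 0.0, 5.0],
])
collab_ab = cosine_similarity(ratings[:, 0], ratings[:, 1])
collab_ac = cosine_similarity(ratings[:, 0], ratings[:, 2])

print(f"content-based: A~B {content_ab:.2f}, A~C {content_ac:.2f}")
print(f"collaborative: A~B {collab_ab:.2f}, A~C {collab_ac:.2f}")
```

The content-based score needs no ratings at all, only item attributes. The collaborative score needs no attributes at all, only user behavior.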

Data preparation (10 minutes)

  • Lecture and hands-on exercises: Explore data in a Jupyter notebook; prepare the movie ratings dataset
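
Preparing a ratings dataset usually involves turning a long list of (user, movie, rating) records into a user-by-movie matrix. The sketch below shows one way this could look. The records are invented placeholders, not the course dataset, and names such as `user_index` and `matrix` are chosen only for this example.

```python
import numpy as np

# Hypothetical (user_id, movie_id, rating) records; a real dataset
# would be loaded from a file instead.
ratings = [
    (10, 100, 5.0), (10, 101, 3.0),
    (11, 100, 4.0), (11, 102, 1.0),
    (12, 101, 2.0), (12, 102, 5.0),
]

# Map the raw IDs to consecutive row and column positions.
user_ids = sorted({u for u, _, _ in ratings})
movie_ids = sorted({m for _, m, _ in ratings})
user_index = {u: i for i, u in enumerate(user_ids)}
movie_index = {m: j for j, m in enumerate(movie_ids)}

# Fill a dense user-by-movie matrix; 0 marks "no rating observed".
matrix = np.zeros((len(user_ids), len(movie_ids)))
for u, m, r in ratings:
    matrix[user_index[u], movie_index[m]] = r

# Fraction of user/movie pairs with no rating.
observed = matrix > 0
sparsity = 1.0 - observed.mean()

print(matrix)
print(f"sparsity: {sparsity:.2f}")
```

Real ratings matrices tend to be far sparser than this toy one, so a sparse representation may be a better fit at scale.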

Generative models and RBMs (15 minutes)

  • Lecture and hands-on exercises: Introduction to generative models; introduction to restricted Boltzmann machines; train a baseline movie recommender system
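
For readers who want a feel for restricted Boltzmann machines before the session, here is a minimal NumPy sketch of a binary RBM trained with one step of contrastive divergence (CD-1). It is not the course's implementation. The class name `TinyRBM`, the toy "liked it" matrix, and the hyperparameters (2 hidden units, learning rate 0.1, 2,000 updates) are assumptions made for this example, and reducing ratings to liked/not-liked flags is a simplification.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TinyRBM:
    """Bernoulli-Bernoulli RBM trained with CD-1 (illustrative only)."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = self.rng.normal(0.0, 0.1, size=(n_visible, n_hidden))
        self.b_visible = np.zeros(n_visible)
        self.b_hidden = np.zeros(n_hidden)

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_hidden)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_visible)

    def cd1_update(self, v0, lr=0.1):
        # Positive phase: hidden activations driven by the data.
        h0_prob = self.hidden_probs(v0)
        h0 = (self.rng.random(h0_prob.shape) < h0_prob).astype(float)
        # Negative phase: one Gibbs step back to the visible layer and up again.
        v1_prob = self.visible_probs(h0)
        h1_prob = self.hidden_probs(v1_prob)
        # Move parameters toward data statistics, away from model statistics.
        n = v0.shape[0]
        self.W += lr * (v0.T @ h0_prob - v1_prob.T @ h1_prob) / n
        self.b_visible += lr * (v0 - v1_prob).mean(axis=0)
        self.b_hidden += lr * (h0_prob - h1_prob).mean(axis=0)
        return float(np.mean((v0 - v1_prob) ** 2))  # reconstruction error

    def reconstruct(self, v):
        return self.visible_probs(self.hidden_probs(v))

# Toy "liked it" matrix (1 = liked, 0 = not liked or unseen); rows are users.
likes = np.array([
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1],
], dtype=float)

rbm = TinyRBM(n_visible=6, n_hidden=2)
errors = [rbm.cd1_update(likes) for _ in range(2000)]

# Score unseen movies for a new user by reconstructing their vector.
new_user = np.array([[1, 1, 0, 0, 0, 0]], dtype=float)
scores = rbm.reconstruct(new_user)[0]
scores[new_user[0] == 1] = -np.inf  # never recommend what was already liked
recommended = int(np.argmax(scores))

print(f"reconstruction error: {errors[0]:.3f} -> {errors[-1]:.3f}")
print("recommended movie index:", recommended)
```

The reconstruction error is only a convenient progress signal here. It is not the quantity that contrastive divergence optimizes, so it should be read as a rough indicator.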

Recommender systems (15 minutes)

  • Lecture and hands-on exercises: Train RBMs on movie ratings datasets; use RBMs to generate movie recommendations; evaluate results against the baseline
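
One common way to compare a recommender against a baseline is to measure error only on held-out ratings. The sketch below assumes mean squared error as the metric and a "predict the average rating" baseline. Both are assumptions for illustration, and the course may use a different metric or baseline. All numbers are placeholders, including `model_pred`, which merely stands in for whatever a trained model would output.

```python
import numpy as np

def masked_mse(predicted, actual, mask):
    """Mean squared error over entries where mask is True (held-out ratings)."""
    diff = (predicted - actual)[mask]
    return float(np.mean(diff ** 2))

# Hypothetical held-out ratings; 0 means "no rating to evaluate against".
actual = np.array([
    [5.0, 0.0, 1.0],
    [0.0, 4.0, 2.0],
])
mask = actual > 0

# Baseline: predict the same average rating everywhere.
train_mean = 3.0  # assumed mean rating of the training set
baseline_pred = np.full(actual.shape, train_mean)

# Placeholder for model output (invented numbers, not real results).
model_pred = np.array([
    [4.5, 3.0, 1.5],
    [2.0, 3.5, 2.5],
])

baseline_mse = masked_mse(baseline_pred, actual, mask)
model_mse = masked_mse(model_pred, actual, mask)

print(f"baseline MSE: {baseline_mse:.2f}")
print(f"model MSE:    {model_mse:.2f}")
```

Masking matters because unrated entries carry no ground truth. Including them would reward a model for predicting zeros.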

Wrap-up and Q&A (10 minutes)