Probabilistic modeling with TensorFlow Probability
Rethinking machine learning
Probabilistic models let you encode your own or your company’s institutional knowledge into the model before you start collecting data, so you can make probabilistic inferences automatically from datasets that need not be large or even clean. Unlike many popular machine learning models, such as neural networks, probabilistic models are not black boxes. They let you infer causes from effects in a fairly transparent manner, which matters in heavily regulated industries, such as finance and health care, where you have to explain the basis of your decisions. In addition, the conventional reliance on maximum likelihood estimates (MLE), which are point estimates that carry no measure of their own uncertainty, can lead to costly misjudgments of risk. It’s imperative that all models quantify the uncertainty inherent in their point estimates so that sound business decisions can be made under uncertainty.
You can quantify the uncertainty in your estimates quite easily using TensorFlow Probability (TFP), one of the most powerful open source probabilistic machine learning libraries. TFP gives you the tools to build and fit complex probabilistic models using a few simple lines of Python code—letting you focus on model building and evaluation while automating the necessary statistical inferences.
In this hands-on four-hour course, Deepak Kanungo, Mike Shwe, and Josh Dillon teach you to use TFP to quantify the uncertainty inherent in all point estimates. Join in to learn how to make realistic probabilistic predictions without making unrealistic assumptions in your models, enabling you to make sound business decisions in the face of uncertainty.
What you'll learn and how you can apply it
By the end of this live online course, you’ll understand:
- The sources of errors in models
- The hazards of using conventional statistics to quantify uncertainty in estimates
- The benefits of quantifying uncertainty using Bayesian inference
- How to explicitly encode personal and institutional knowledge into your models
- The advantages of using TFP to learn from small datasets
- The concepts behind Bayesian linear regression
- The underlying principles of change point test analysis of your business processes
- The high-level ideas behind state-of-the-art algorithms such as Markov chain Monte Carlo (MCMC), the No-U-Turn Sampler (NUTS), and automatic differentiation variational inference (ADVI)
And you’ll be able to:
- Build probabilistic models in TFP for your business processes
- Use these models to quantify the uncertainty in your company’s cost of capital so that you can make better capital budgeting decisions
- Use these models to estimate the uncertainty around change point tests in your business processes for quality control, intrusion detection, medical diagnostics, spam filtering, and website tracking
- Continually update your estimates based on new data
This training course is for you because...
- You’re an analyst or developer who needs to build probabilistic models that quantify the uncertainty in your estimates or forecasts.
Prerequisites
- A basic understanding of probability and statistics (read “Seeing Theory” for a visual overview)
- A working knowledge of Python programming
Recommended preparation
- Set up a free Colaboratory account and create an empty Colab document
- Read “The Golem of Prague” and “Small Worlds and Large Worlds” (chapters 1 and 2 in Statistical Rethinking)
- Read “What Is Probabilistic Programming?” (article)
- Play the simulated Monty Hall game (For context, read “Understanding the Monty Hall Problem.”)
Recommended follow-up
- Read Bayesian Methods for Hackers: Probabilistic Programming and Bayesian Inference (book)
- Watch Deep Dive into Probabilistic Machine Learning (video, 42m)
- Explore “Probabilistic Programming from Scratch 3: Performance and PyMC3” (O’Reilly oriole)
About your instructors
Deepak Kanungo is the founder and CEO of Hedged Capital LLC, an AI-powered trading and advisory firm. Previously, Deepak was a financial advisor at Morgan Stanley, a Silicon Valley fintech entrepreneur, and a director in the Global Planning Department at MasterCard International. Deepak was educated at Princeton University (Astrophysics) and The London School of Economics (Finance and Information Systems). Hedged Capital’s trading algorithms use probabilistic models and technologies such as TFP. In 2005, Deepak invented a project portfolio management system using Bayesian inference, the foundation of all probabilistic programming languages.
Mike Shwe is the product manager for TensorFlow Probability at Google. He has also held various technical program management positions at Google related to Knowledge Graph. Previously, Mike developed and deployed commercial systems in predictive marketing analytics for CPG companies, probabilistic text classification systems for CRM, and Bayesian diagnostics in medicine and industrial equipment. Mike holds a BS in Symbolic Systems and an MS in Medical Informatics, both from Stanford University.
Josh Dillon is the co-inventor of TFP and a staff software engineer at Google. Previously, Josh was in academia and held teaching positions in software engineering at the Georgia Institute of Technology and Purdue University. Josh has a Ph.D. in Computational Science and Engineering from the Georgia Institute of Technology, an M.S. in Computer and Electrical Engineering from Purdue University, and a B.S. in Computer and Electrical Engineering from Michigan Technological University.
Schedule
The timeframes are only estimates and may vary according to how the class is progressing.
Introduction and the Monty Hall problem (15 minutes)
- Group discussion: Introduction; your experience with Python and statistics
- Hands-on exercises: Explore the Monty Hall problem through the online simulator
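For readers who want to try the game before class, the following is a minimal sketch of a Monty Hall simulation in plain Python. It is not the course's online simulator; the function names `play` and `win_rate`, the trial count, and the seed are all illustrative assumptions. It assumes the standard rules: the host knows where the car is and always opens a goat door that the contestant did not pick.

```python
import random


def play(switch, rng):
    """Play one round; return True if the contestant wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Switch to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, trials=100_000, seed=0):
    """Estimate the probability of winning under a fixed strategy."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials


print("stay  :", win_rate(switch=False))
print("switch:", win_rate(switch=True))
```

Under these rules the two printed frequencies should settle near 1/3 for staying and 2/3 for switching as the number of trials grows.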
Epistemic probability (15 minutes)
- Group discussion: Epistemic probability; how it differs from the frequentist view of probability, on which much of conventional statistics is based
Bayesian inference (25 minutes)
- Lecture: The results from the game; why Bayesian inference offers a solution to the apparent paradox; Bayes’s theorem, the fundamental algorithm of all probabilistic programming languages
- Group discussion and Q&A
- Break (5 minutes)
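As an illustration of how Bayes's theorem resolves the Monty Hall game, the sketch below computes the exact posterior with Python's `fractions` module. The door numbering, the function name `monty_hall_posterior`, and the assumption that the host chooses at random when two goat doors are available are illustrative choices, not material taken from the course.

```python
from fractions import Fraction


def monty_hall_posterior(pick=1, opened=3):
    """Posterior P(car behind door | host opened `opened`), given `pick`.

    Assumes pick != opened and that the host never reveals the car.
    """
    doors = (1, 2, 3)
    prior = {d: Fraction(1, 3) for d in doors}

    def likelihood(car):
        # P(host opens `opened` | car is behind `car`, contestant picked `pick`)
        if car == opened:
            return Fraction(0)      # the host never opens the car door
        if car == pick:
            return Fraction(1, 2)   # the host picks one of two goat doors at random
        return Fraction(1)          # the host is forced to open this door

    unnormalized = {d: prior[d] * likelihood(d) for d in doors}
    evidence = sum(unnormalized.values())
    return {d: unnormalized[d] / evidence for d in doors}


posterior = monty_hall_posterior(pick=1, opened=3)
for door, prob in posterior.items():
    print(f"door {door}: {prob}")
```

With door 1 picked and door 3 opened, the posterior is 1/3 for door 1 and 2/3 for door 2, which matches the frequencies a simulation produces.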
Setup (5 minutes)
- Lecture: A quick validation of the environment; a brief review of the features of the Colab notebook
TensorFlow Probability (TFP) (50 minutes)
- Lecture: The basic concepts and declarative commands in Python code used for building probabilistic models in TFP
- Hands-on exercises: Walk through the built-in change point test analysis model in the Colab notebook and analyze its output graphs
- Group discussion and Q&A
- Break (5 minutes)
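To give a feel for what a change point model infers, here is a small sketch in plain Python rather than TFP. It is not the course's built-in model. It assumes hypothetical count data, Poisson-distributed counts with one rate before the change and another after it, a Gamma(shape=2, rate=0.5) prior on each rate, and a uniform prior on the change location. Because the Gamma prior is conjugate to the Poisson, the rates can be integrated out exactly and the posterior over the change point obtained by enumeration.

```python
import math


def log_marginal(total, n, a=2.0, b=0.5):
    """Log marginal likelihood of n Poisson counts summing to `total`,
    with the rate integrated out under a Gamma(shape=a, rate=b) prior.
    The factorial terms are dropped: they are the same for every split."""
    return (a * math.log(b) - math.lgamma(a)
            + math.lgamma(a + total) - (a + total) * math.log(b + n))


def change_point_posterior(counts):
    """Posterior over tau, the index at which the second regime starts."""
    log_weights = {}
    for tau in range(1, len(counts)):
        before, after = counts[:tau], counts[tau:]
        log_weights[tau] = (log_marginal(sum(before), len(before))
                            + log_marginal(sum(after), len(after)))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_weights.values())
    weights = {tau: math.exp(lw - peak) for tau, lw in log_weights.items()}
    norm = sum(weights.values())
    return {tau: w / norm for tau, w in weights.items()}


# Hypothetical daily event counts with a visible jump halfway through.
counts = [2, 1, 3, 2, 2, 1, 3, 2, 9, 11, 8, 10, 12, 9, 10, 11]
posterior = change_point_posterior(counts)
most_likely = max(posterior, key=posterior.get)
print("most likely change index:", most_likely)
print("posterior probability   :", round(posterior[most_likely], 3))
```

The output is a full probability distribution over possible change locations, not a single yes/no test result, which is the kind of uncertainty estimate the course objectives describe.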
Statistical analysis (15 minutes)
- Hands-on exercises: Run the built-in market model (MM), which uses standard linear regression, over various start and end dates to draw 10 random samples; compute the alpha, beta, and sample error of your company’s stock (or of a proxy stock if your company is private), including the 95% confidence intervals for all parameters; note your company’s cost of capital and other results in your notebook
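The market model regresses a stock's returns on the market's returns, stock = alpha + beta * market + error. The sketch below fits it by ordinary least squares in plain Python. It is not the course notebook: the returns are synthetic, the "true" values 0.0002 and 1.3 are made up for illustration, only one sample is drawn, and the 95% interval uses the normal approximation (1.96 standard errors) where a t quantile would be more exact for small samples.

```python
import math
import random


def fit_market_model(market, stock):
    """Ordinary least squares fit of stock = alpha + beta * market + error.

    Returns (alpha, beta, residual standard error, standard error of beta).
    """
    n = len(market)
    mean_x = sum(market) / n
    mean_y = sum(stock) / n
    sxx = sum((x - mean_x) ** 2 for x in market)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(market, stock))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    residuals = [y - (alpha + beta * x) for x, y in zip(market, stock)]
    resid_se = math.sqrt(sum(r * r for r in residuals) / (n - 2))
    se_beta = resid_se / math.sqrt(sxx)
    return alpha, beta, resid_se, se_beta


# Synthetic daily returns (assumed values, not real market data).
rng = random.Random(42)
market = [rng.gauss(0.0005, 0.01) for _ in range(250)]
stock = [0.0002 + 1.3 * m + rng.gauss(0.0, 0.015) for m in market]

alpha, beta, resid_se, se_beta = fit_market_model(market, stock)
beta_ci = (beta - 1.96 * se_beta, beta + 1.96 * se_beta)
print(f"alpha={alpha:.5f}  beta={beta:.3f}  residual SE={resid_se:.4f}")
print(f"approximate 95% confidence interval for beta: "
      f"({beta_ci[0]:.3f}, {beta_ci[1]:.3f})")
```

Rerunning the fit on different date windows, as the exercise suggests, shows how much alpha and beta move from sample to sample.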
Types of modeling errors (15 minutes)
- Group discussion: The sources of errors in models; the imperative need for quantifying uncertainty in your estimates
Confidence intervals (10 minutes)
- Lecture: The conventional meaning of probability; how confidence intervals are actually meant to be used
Quantifying uncertainty (15 minutes)
- Group discussion: Why it’s inappropriate to use confidence intervals to quantify uncertainty in estimates that are not normally distributed
- Break (5 minutes)
TFP algorithms (30 minutes)
- Lecture: The basic concepts behind the Markov chain Monte Carlo (MCMC), No-U-Turn Sampler (NUTS), and automatic differentiation variational inference (ADVI) algorithms; what problems they’re best suited to address
Bayesian regression (20 minutes)
- Hands-on exercises: Recode the MM in TFP using Bayesian linear regression; produce credible intervals for your company’s cost of capital and all other relevant parameters
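As a rough picture of what Bayesian linear regression produces, the sketch below samples the posterior of a market model with a random-walk Metropolis sampler written in plain Python. This is the simplest member of the MCMC family, swapped in here so the example has no dependencies; it is not NUTS, not ADVI, and not TFP code. The synthetic monthly returns (in percent), the priors (alpha ~ Normal(0, 1), beta ~ Normal(1, 1), log sigma ~ Normal(0, 1)), the step sizes, and every function name are assumptions made for illustration.

```python
import math
import random


def log_posterior(params, xs, ys):
    """Unnormalized log posterior of (alpha, beta, log_sigma)."""
    alpha, beta, log_sigma = params
    sigma = math.exp(log_sigma)
    # Assumed priors: alpha ~ N(0, 1), beta ~ N(1, 1), log_sigma ~ N(0, 1).
    log_prior = -0.5 * alpha ** 2 - 0.5 * (beta - 1.0) ** 2 - 0.5 * log_sigma ** 2
    # Normal likelihood for each observed return.
    log_lik = sum(-0.5 * ((y - alpha - beta * x) / sigma) ** 2 - log_sigma
                  for x, y in zip(xs, ys))
    return log_prior + log_lik


def metropolis(xs, ys, n_samples=20_000, burn_in=5_000,
               steps=(0.2, 0.05, 0.08), seed=1):
    """Random-walk Metropolis; returns post-burn-in draws of the parameters."""
    rng = random.Random(seed)
    current = [0.0, 1.0, 0.0]
    current_lp = log_posterior(current, xs, ys)
    draws = []
    for i in range(n_samples + burn_in):
        proposal = [c + rng.gauss(0.0, s) for c, s in zip(current, steps)]
        proposal_lp = log_posterior(proposal, xs, ys)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        if i >= burn_in:
            draws.append(tuple(current))
    return draws


def credible_interval(values, mass=0.95):
    """Equal-tailed credible interval from posterior draws."""
    ordered = sorted(values)
    lo = ordered[int((1 - mass) / 2 * len(ordered))]
    hi = ordered[int((1 + mass) / 2 * len(ordered)) - 1]
    return lo, hi


# Synthetic monthly returns in percent (assumed values, not real data).
data_rng = random.Random(0)
xs = [data_rng.gauss(0.5, 4.0) for _ in range(60)]
ys = [0.1 + 1.2 * x + data_rng.gauss(0.0, 2.0) for x in xs]

draws = metropolis(xs, ys)
betas = [d[1] for d in draws]
beta_mean = sum(betas) / len(betas)
beta_lo, beta_hi = credible_interval(betas)
print(f"posterior mean of beta: {beta_mean:.3f}")
print(f"95% credible interval : ({beta_lo:.3f}, {beta_hi:.3f})")
```

Unlike a confidence interval, the credible interval is a direct probability statement about beta given the data and the priors. In a capital budgeting setting, each posterior draw of beta could be passed through a cost-of-capital formula to obtain a distribution for that quantity as well.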
Wrap-up and Q&A (10 minutes)