Live Online Training

Applied Probability Theory for Everyone

From the Basics to Bayesian Analysis, A/B Testing and Markov Chains in Python

Bruno Gonçalves

Recent advances in machine learning and artificial intelligence have drawn a great deal of attention and interest to these two areas of computer science and mathematics. Most of these advances have relied on stochastic and probabilistic models, which require a deep understanding of probability theory and of how to apply it to each specific situation.

In this course, we will cover, in a hands-on and incremental fashion, the theoretical foundations of probability theory and applications such as Markov Chains, Bayesian Analysis, and A/B Testing that are commonly used in both industry and academia.

What you'll learn and how you can apply it

  • Probability and Conditional Probability
  • Probability Distributions
  • Likelihood
  • Bayes' Theorem
  • Bayesian Statistics
  • Random Walks
  • Markov Chains
  • A/B Testing

This training course is for you because...

The typical audience member is a data scientist who wants to master the concepts and ideas behind probability and learn how to apply them in machine learning and AI contexts. The primary audience is someone who, while familiar with Python programming, has no previous experience with probabilistic models and wants to take the first grounded steps. A secondary audience is people who have already encountered this class of approaches and wish to better understand what's going on “under the hood.” The course emphasizes practical examples and applications, and highlights the difficulties of applying this class of models.


Prerequisites

  • Basic Python
  • NumPy
  • Matplotlib
  • Jupyter

Course Set-up

  • Scientific Python distribution like Anaconda

Recommended Preparation

These resources are optional, but helpful if you need a refresher on Python:

  • (video) Python Programming Language LiveLessons by David Beazley: https://www.safaribooksonline.com/videos/python-programming-language/9780134217314
  • (video) Modern Python LiveLessons: Big Ideas and Little Code in Python by Raymond Hettinger: https://www.safaribooksonline.com/videos/modern-python-livelessons/9780134743400

Stay connected with Bruno and up-to-date on the world of data, science, and machine learning at https://data4sci.com/newsletter

About your instructor

  • Bruno Gonçalves is currently a Senior Data Scientist working at the intersection of Data Science and Finance. Previously, he was a Data Science fellow at NYU's Center for Data Science while on leave from a tenured faculty position at Aix-Marseille Université. Since completing his PhD in the Physics of Complex Systems in 2008, he has been pursuing the use of Data Science and Machine Learning to study Human Behavior. Using large datasets from Twitter, Wikipedia, web access logs, and Yahoo! Meme, he studied how we can observe both large-scale and individual human behavior in an unobtrusive and widespread manner. The main applications have been to the study of Computational Linguistics, Information Diffusion, Behavioral Change and Epidemic Spreading. In 2015 he was awarded the Complex Systems Society's Junior Scientific Award for "outstanding contributions in Complex Systems Science" and in 2018 he was named a Science Fellow of the Institute for Scientific Interchange in Turin, Italy.


Schedule

The timeframes are only estimates and may vary according to how the class is progressing.

Segment 1 - Basic Definitions and Intuition (Length: 50 min)

  • Understand what a probability is
  • Calculate the probability of different outcomes
  • Generate numbers following a specific probability distribution
  • Estimate population sizes from a sample
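
As a flavor of the "generate numbers following a specific probability distribution" objective, here is a minimal sketch of inverse transform sampling for a discrete distribution. It is an illustration only, not the course material: the function name `sample_discrete`, the outcomes, and the probabilities are all hypothetical, and it uses only the standard library rather than NumPy.

```python
import random
from collections import Counter


def sample_discrete(outcomes, probabilities, rng):
    """Draw one outcome by inverting the cumulative distribution."""
    u = rng.random()  # uniform number in [0, 1)
    cumulative = 0.0
    for outcome, p in zip(outcomes, probabilities):
        cumulative += p
        if u < cumulative:
            return outcome
    return outcomes[-1]  # guard against floating-point round-off


# Hypothetical distribution chosen for illustration.
outcomes = ["A", "B", "C"]
probabilities = [0.5, 0.3, 0.2]

rng = random.Random(42)  # fixed seed so the run is reproducible
n_draws = 10_000
counts = Counter(sample_discrete(outcomes, probabilities, rng) for _ in range(n_draws))

# Empirical frequencies should approach the target probabilities.
frequencies = {outcome: counts[outcome] / n_draws for outcome in outcomes}
print(frequencies)
```

With enough draws, the empirical frequencies settle close to the probabilities that were requested, which is a quick sanity check on any sampler.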

Segment 2 - Bayesian Statistics (Length: 50 min)

  • Understand conditional probabilities
  • Derive Bayes' Theorem
  • Understand how to update a belief
  • Break (10m)
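
To make "update a belief" concrete, the sketch below applies Bayes' Theorem, P(H | E) = P(E | H) P(H) / P(E), to a diagnostic-test scenario. The numbers (1% prior, 95% sensitivity, 5% false positive rate) and the function name `bayes_update` are assumptions chosen for illustration, and the second update assumes the two test results are independent given the true condition.

```python
def bayes_update(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Return P(H | evidence) from P(H) and the two likelihoods."""
    numerator = p_evidence_given_h * prior
    # Total probability of the evidence, summed over H and not-H.
    evidence = numerator + p_evidence_given_not_h * (1.0 - prior)
    return numerator / evidence


# Hypothetical values chosen for illustration.
prior = 0.01                # P(condition) before any test
sensitivity = 0.95          # P(positive | condition)
false_positive_rate = 0.05  # P(positive | no condition)

# Belief after one positive test, then after a second positive test.
after_one = bayes_update(prior, sensitivity, false_positive_rate)
after_two = bayes_update(after_one, sensitivity, false_positive_rate)

print(f"after one positive test:  {after_one:.3f}")
print(f"after two positive tests: {after_two:.3f}")
```

The posterior from the first test becomes the prior for the second, which is the essence of sequential Bayesian updating.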

Segment 3 - Random Walks and Markov Chains (Length: 50 min)

  • Simulate a random walk in 1D
  • Understand random walks on networks
  • Define Markov Chains
  • Implement PageRank
  • Break (10m)
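
PageRank can be viewed as the stationary distribution of a random walk on a network with occasional random jumps. The following is a minimal power-iteration sketch under stated assumptions, not the implementation used in the course: the toy graph, the damping factor of 0.85, and the function name `pagerank` are illustrative, and every link target is assumed to also appear as a key in `links`.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Power iteration on a dict mapping each node to its outgoing links."""
    nodes = sorted(links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}  # start from a uniform distribution
    for _ in range(max_iter):
        # Rank held by dangling nodes (no outgoing links) is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not links[node])
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for node in nodes:
            for target in links[node]:
                # Each node shares its rank equally among its outgoing links.
                new_rank[target] += damping * rank[node] / len(links[node])
        change = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if change < tol:
            break
    return rank


# Hypothetical four-page web: A links to B and C, B to C, C to A, D to C.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
for node, value in sorted(ranks.items(), key=lambda item: -item[1]):
    print(node, round(value, 4))
```

In this toy graph, page D has no incoming links, so it keeps only the share of rank that comes from random jumps, while C collects rank from three other pages.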

Segment 4 - A/B Testing (Length: 40 min)

  • Understand Hypothesis Testing
  • Measure p-values
  • Compare the likelihood of two outcomes
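
One common way to measure a p-value in an A/B test is a two-proportion z-test, which relies on a normal approximation. The sketch below is an illustration under that assumption and may differ from the approach taken in the course; the conversion counts and the function name `two_proportion_p_value` are hypothetical, and the function assumes the pooled rate is strictly between 0 and 1.

```python
import math


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a difference in conversion rates (normal approximation)."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Under the null hypothesis both variants share one pooled conversion rate.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # P(|Z| >= |z|) for a standard normal variable Z.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: 100 of 1000 visitors convert on A, 130 of 1000 on B.
z, p_value = two_proportion_p_value(100, 1000, 130, 1000)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

A small p-value says the observed difference would be unlikely if both variants truly converted at the same rate; it does not by itself measure how large or how valuable the difference is.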