Live Online Training

Probability with Python: Essential Math for Data Science

Take control of your data by honing your fundamental math skills

Topic: Data
Michael Cullan

Machine learning requires a strong foundation in probability. This course solidifies that groundwork by reviewing core probability concepts: important distributions, Bayes' rule, and conditional expectation. We also discuss how the different probability distributions relate to one another, and you will learn to use Python’s statistics- and probability-oriented libraries.

This is the third course in a four-part series focused on essential math topics. These courses are grouped in pairs with this natural progression:

  1. Linear Algebra with Python
  2. Linear Regression with Python


  1. Probability with Python
  2. Statistics and Hypothesis Testing with Python

What you'll learn and how you can apply it

By the end of this live, hands-on, online course, you’ll understand:

  • The difference between discrete and continuous random variables
  • The differences among commonly used probability distributions and how they are related
  • The Central Limit Theorem
  • Bayes’ theorem

And you’ll be able to:

  • Choose an appropriate probability distribution for a process you are modeling
  • Apply Bayes’ theorem
  • Use Python libraries to define probability distributions and draw samples from them
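
As a rough illustration of the last point, the sketch below draws samples from one continuous and one discrete distribution. It assumes only Python's standard library (`random` and `statistics`); the course itself may use other libraries, and the parameters and the helper name `binomial_draw` are made up for this example.

```python
import random
import statistics

# Fixed seed so repeated runs give the same draws (illustrative choice).
rng = random.Random(42)

# Continuous example: 10,000 draws from a normal distribution
# with mean 0 and standard deviation 1.
normal_draws = [rng.gauss(0.0, 1.0) for _ in range(10_000)]


def binomial_draw(n, p):
    """Count successes in n independent Bernoulli trials with success probability p."""
    return sum(rng.random() < p for _ in range(n))


# Discrete example: 10,000 draws from a Binomial(20, 0.3) distribution.
binomial_draws = [binomial_draw(20, 0.3) for _ in range(10_000)]

normal_mean = statistics.fmean(normal_draws)
binomial_mean = statistics.fmean(binomial_draws)

print(f"normal sample mean:   {normal_mean:.3f}")    # theoretical mean is 0
print(f"binomial sample mean: {binomial_mean:.3f}")  # theoretical mean is n * p = 6
```

The sample means should land close to the theoretical means, which is a quick sanity check that the draws come from the intended distributions.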

This training course is for you because...

  • You are in a technical role and want the foundational knowledge to transition into a data scientist position
  • You want to apply data-driven decision-making in your position
  • You work with data and want to generate insights and analysis with that data
  • You want to become a data analyst or data scientist


Prerequisites

  • Basic statistics
  • Basic Python: variable creation, conditional statements, functions, loops

Recommended preparation:

Recommended follow-up:

About your instructor

  • Michael holds a master’s degree in statistics and a bachelor’s degree in mathematics. His academic research areas ranged from computational paleobiology, where he developed software for measuring evidence for disparate evolutionary models based on fossil data, to music and AI, where he assisted in modeling musical data for a jazz improvisation robot.

    In his current work, Michael teaches hands-on courses in data science as well as business-oriented topics in managing data science initiatives at the organizational level. Aside from teaching, he leads internal data science projects for Pragmatic Institute in support of the marketing and operations teams. In his free time, he applies his math and programming skills toward creating code-based visual art and design projects.


The timeframes are only estimates and may vary according to how the class is progressing.

Introduction and Getting Started (5 minutes)

  • Introduction to Jupyter Notebook environment

Introduction to Probability and Random Variables (5 minutes)

  • Lecture: What is a random variable?

Statistics of Random Variables I (10 minutes)

  • Lecture: Discrete Random Variables

Statistics of Random Variables II (10 minutes)

  • Lecture: Continuous Random Variables

Statistics of Random Variables III (10 minutes)

  • Lecture: Quantile Functions
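
For readers new to the term, a quantile function is the inverse of the cumulative distribution function (CDF): it maps a probability to the value below which that share of the distribution lies. The sketch below is not taken from the course materials; it assumes the standard library's `statistics.NormalDist` and a standard normal distribution chosen purely for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

# The quantile function (inverse CDF) maps a probability to a value.
p = 0.975
q = standard_normal.inv_cdf(p)

# Applying the CDF to that value recovers the original probability.
round_trip = standard_normal.cdf(q)

print(f"quantile at p={p}: {q:.4f}")
print(f"cdf at that quantile: {round_trip:.4f}")
```

For the standard normal, the 0.975 quantile is approximately 1.96, the familiar cutoff for a two-sided 95% interval.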

Modeling Observations with Random Variables (5 minutes)

  • Lecture: How to choose your distribution

Q&A and Discussion (10 minutes)

Break (5 minutes)

Bernoulli Trials and Binomial Distribution (10 minutes)

  • Lecture: Independence and Conditional Expectation
  • Lecture: Binomial Random Variables
  • Exercise: Adjust distribution parameters to see their effect

Geometric Distribution (10 minutes)

  • Lecture: Memoryless Property
  • Exercise: Adjust distribution parameters to see their effect

Poisson Distribution (5 minutes)

  • Exercise: Adjust distribution parameters to see their effect

Exponential Distribution (5 minutes)

  • Exercise: Adjust distribution parameters to see their effect

Normal Distribution (10 minutes)

  • Lecture: Central Limit Theorem
  • Exercise: Adjust distribution parameters to see their effect
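
The Central Limit Theorem says that means of many independent draws tend toward a normal distribution even when the underlying distribution is not normal. The simulation below is a minimal sketch of that idea, not the course's exercise: it assumes the standard library only, and the exponential distribution, sample size, and number of repetitions are arbitrary illustrative choices.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed for repeatability

SAMPLE_SIZE = 50     # observations averaged in each sample
NUM_SAMPLES = 5_000  # number of sample means to collect


def sample_mean(n):
    """Mean of n draws from an exponential distribution with rate 1 (mean 1, std dev 1)."""
    return statistics.fmean([rng.expovariate(1.0) for _ in range(n)])


sample_means = [sample_mean(SAMPLE_SIZE) for _ in range(NUM_SAMPLES)]

mean_of_means = statistics.fmean(sample_means)
spread_of_means = statistics.stdev(sample_means)

# The theorem predicts a spread of sigma / sqrt(n) for the sample means.
expected_spread = 1.0 / SAMPLE_SIZE ** 0.5

print(f"mean of sample means:   {mean_of_means:.3f} (population mean is 1)")
print(f"spread of sample means: {spread_of_means:.3f} (predicted {expected_spread:.3f})")
```

Although the exponential distribution is strongly skewed, the sample means cluster around 1 with a spread near 1/sqrt(50), as the theorem predicts.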

Beta Distribution and Bayes’ Theorem (15 minutes)

  • Lecture: The Beta Distribution
  • Exercise: Adjust distribution parameters to see their effect
  • Lecture: Bayes’ Theorem
  • Exercise: Adjusting distribution based on new data
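
One common way to adjust a distribution based on new data is the Beta-Binomial update: with a Beta(alpha, beta) prior on a success probability and binomial data, Bayes' theorem gives a Beta(alpha + successes, beta + failures) posterior. The sketch below illustrates that rule under assumed numbers; the function names `update_beta` and `beta_mean` and the prior and data are hypothetical and are not taken from the course.

```python
def update_beta(alpha, beta, successes, failures):
    """Return the posterior Beta parameters after observing binomial data."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Assumed prior: Beta(2, 2), a mild belief that the success rate is near 0.5.
prior = (2.0, 2.0)

# Assumed new data: 7 successes and 3 failures in 10 trials.
posterior = update_beta(*prior, successes=7, failures=3)

print(f"prior mean:     {beta_mean(*prior):.3f}")
print(f"posterior:      Beta{posterior}")
print(f"posterior mean: {beta_mean(*posterior):.3f}")
```

The posterior mean sits between the prior mean (0.5) and the observed success rate (0.7), and it moves closer to the data as more observations arrive.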

Q&A and Discussion (5 minutes)