Essential Math for Data Science Bootcamp—with Interactivity
Published by O'Reilly Media, Inc.
Achieve practical math proficiency using Python
In today’s data-driven world, the ability to analyze and extract insights from data is more valuable than ever. However, the mathematical foundations needed for data science can often feel overwhelming. With countless topics in mathematics, it’s difficult to know which ones are truly relevant and applicable to real-world problems.
This two-day bootcamp, led by data science expert Thomas Nield, is designed to give you a practical, hands-on understanding of essential math concepts—so you can apply them immediately. You’ll explore key topics including probability, statistics, hypothesis testing, linear algebra, and calculus, all through an intuitive and visual approach that emphasizes clarity over complex notation.
You’ll start by building a strong foundation in probability and statistics—exploring concepts like conditional probability, Bayes' theorem, distributions, variance, hypothesis testing, z-tests, and t-tests. From there, you’ll dive into linear algebra, where you’ll develop a deeper intuition for vectors, matrices, and transformations while leveraging Python and NumPy. Finally, you’ll unlock the power of calculus for machine learning, simplifying core concepts like derivatives and integrals and applying them directly with hands-on Python exercises.
By the end of this bootcamp, you’ll have the mathematical confidence needed to bridge the gap between theory and application, using data science, statistics, and machine learning with a solid foundation. Whether you’re starting out or looking to refine your skills, this course will equip you with the key math concepts that truly matter.
Skills challenges
At the end of each week, Thomas Nield will provide you with a skills challenge—an interactive scenario-based evaluation to help you determine whether you’ve mastered the skills taught in the live training and whether you’re ready to apply these skills in a real-world setting.
To reinforce your learning, we strongly recommend pursuing each skills challenge before the next week of the course. If you’re unable to successfully complete the challenge, try reviewing the video recording of the live training (emailed to you 24 hours after each session) for tips.
What you’ll learn and how you can apply it
- How probability works and what it means to measure randomness
- How multiple events can affect the probability of another event
- How to use discrete and normal distributions
- When to add and multiply probabilities of different events
- The relationship between samples and populations
- How to frame experiments and hypothesis testing
- Statistical significance and parameter estimation
- The intuition behind vectors and matrices, both visually and numerically
- Important vector/matrix operators—what they mean, and their transformations
- Linear systems and matrix decomposition
- The nature of functions and how they work, including exponential and logarithmic functions
- The two most fundamental operations in calculus: the derivative and the integral
- What Euler’s number e is and how it’s derived
And you’ll be able to:
- Recognize situations where Bayes' theorem applies
- Leverage Python to create continuous distributions
- Leverage statistical significance to test hypotheses
- Quantify uncertainty in experiment results
- Determine confidence intervals and perform A/B testing
- Appreciate vectors and matrices beyond just a grid of numbers, and visualize operations
- Construct a system of linear equations and solve its variables
- Use Python’s NumPy package to perform linear algebra operations
- Calculate the slope or area for any part of a function, using from-scratch code in Python
- Intuitively derive Euler’s number e and other critical functions you’ll encounter in data science
- Apply calculus reasoning to other areas like probability, statistics, and machine learning
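To give a sense of what "from-scratch" slope and area calculations can look like, here is a minimal sketch. The helper names `slope_at` and `area_under`, the step size `h`, and the rectangle count `n` are assumptions chosen for illustration and are not taken from the course materials.

```python
def slope_at(f, x, h=1e-6):
    """Approximate the derivative (slope) of f at x with a central difference."""
    return (f(x + h) - f(x - h)) / (2 * h)


def area_under(f, a, b, n=10_000):
    """Approximate the integral (area) of f over [a, b] with n midpoint rectangles."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


def square(x):
    """Example function f(x) = x^2."""
    return x ** 2


slope = slope_at(square, 3.0)        # exact slope of x^2 at x = 3 is 6
area = area_under(square, 0.0, 1.0)  # exact area under x^2 on [0, 1] is 1/3
print(slope, area)
```

Both helpers are numerical approximations: shrinking `h` or raising `n` trades running time and floating-point noise for accuracy.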
This live event is for you because...
- You’re a budding data science professional who wants to build foundational knowledge in essential math concepts and understand how they apply to probability, statistics, and machine learning.
- You’re a programmer using data science and machine learning libraries and want to understand the math and probability principles behind them.
- You’re managing a data science team and want to have a fundamental understanding of techniques used in the field.
Prerequisites
- A basic understanding of algebra and isolating variables in an equation
- Basic Python proficiency to follow code examples
Recommended preparation
- A computer with a Python 3 environment set up for running code examples, or the O'Reilly interactive Python sandbox in a browser
Recommended follow-up
- Read the first four chapters of Essential Math for Data Science (book by Thomas Nield)
- Take Machine Learning from Scratch (live online training course with Thomas Nield)
- Read Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, second edition (book)
Schedule
The time frames are only estimates and may vary according to how the class is progressing.
Day 1: Probability
Introduction and getting started (10 minutes)
- Discussion: Monty Hall Problem
- Presentation: What is probability?
Understanding probability (10 minutes)
- Presentation: Frequentist and Bayesian probability; odds ratios
- Hands-on exercise: Want to make a bet?
Adding and multiplying probabilities (15 minutes)
- Presentation: Joint probability; union probability
- Hands-on exercises: Rain and joint probability; rain and union probability
Conditional probability (20 minutes)
- Presentation: Conditional probability and colorblindness
- Hands-on exercise: Rain and conditional probability
- Break
Bayes' theorem (20 minutes)
- Presentation: Violence and video games
- Hands-on exercise: Medical testing accuracy
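As a rough illustration of the kind of reasoning a medical-testing exercise involves, the sketch below applies Bayes' theorem to a positive test result. The function name `posterior` and all three input rates are hypothetical values chosen for this example, not figures from the course.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Return P(condition | positive test) using Bayes' theorem."""
    # Total probability of a positive test: true positives plus false positives
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive


# Hypothetical inputs: 1% prevalence, 99% sensitivity, 5% false positive rate
p = posterior(prior=0.01, sensitivity=0.99, false_positive_rate=0.05)
print(round(p, 4))
```

With these assumed rates, the probability of having the condition after a positive test is only about 17%, because false positives from the large healthy group outnumber true positives.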
Binomial distribution (10 minutes)
- Presentation: Binomial distribution
- Hands-on exercise: Airline empty seats
Normal distribution (15 minutes)
- Presentation: Normal distribution; quantile functions
- Hands-on exercise: Predict life expectancy of a phone
Beta distribution (15 minutes)
- Presentation: Beta distribution and probability of probabilities
- Hands-on exercise: Unfair coin flip
Other distributions and closing (5 minutes)
- Presentation: Other distributions (Poisson, exponential)
- Closing and final Q&A
Day 1: Statistics and Hypothesis Testing
Getting started (5 minutes)
- Presentation: What to expect; why learn statistics and hypothesis testing?
The basics (10 minutes)
- Presentation: Mean and standard deviation; the normal distribution
Central limit theorem (10 minutes)
- Presentation: Discovering the central limit theorem
- Hands-on exercises: Average die rolls; average samples from uniform distribution
Population and sample sizes (15 minutes)
- Presentation: Populations versus samples; standard deviation of the mean
- Hands-on exercises: Sample golden retriever weights; how much is enough?
P-values and z-tests (20 minutes)
- Presentation: Discovering the p-value; standard normal distribution and the z-test
- Hands-on exercise: The tea party problem
- Discussion: What could go wrong using a p-value?
- Break
Confidence intervals (10 minutes)
- Presentation: What is a confidence interval?
- Activity: Calculate a confidence interval for the mean unionized salary
T-distribution (10 minutes)
- Presentation: T-distribution versus the normal distribution
- Hands-on exercise: Sample size effect on t-distribution
T-tests (15 minutes)
- Presentation: Discovering the t-test
- Hands-on exercises: Is my golden retriever underweight? Is my golden retriever overweight?
Paired and 2-sample t-tests (15 minutes)
- Presentation: Extending the t-test
- Hands-on exercises: Does this diet work? Did a pricing change increase sales?
Q&A and closing (10 minutes)
Day 2: Linear Algebra
Getting started (5 minutes)
- Presentation: What is linear algebra? Why learn linear algebra?
Vectors, combining, and scaling (25 minutes)
- Presentation: What are vectors? Combining and scaling vectors; span and linear dependence
- Hands-on exercises: Add and scale vectors in NumPy; add and scale vectors
Transforming vectors and matrices (30 minutes)
- Presentation: Basis vectors, matrices, the determinant
- Hands-on exercises: Matrices and the determinant in NumPy; transform a vector
System of linear equations and inverse matrices (15 minutes)
- Presentation: Solving systems of linear equations with inverse matrices
- Hands-on exercises: Solving systems of linear equations with NumPy; a word problem
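For orientation, here is a minimal sketch of solving a small linear system with NumPy, both through the inverse matrix and through `np.linalg.solve`. The two equations are a hypothetical example, not one from the course.

```python
import numpy as np

# Hypothetical system of equations:
#   2x + 1y = 5
#   1x + 3y = 10
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])

# Inverse-matrix approach: x = A^-1 b
x_inv = np.linalg.inv(A) @ b

# Direct solver, which avoids forming the inverse explicitly
x_solve = np.linalg.solve(A, b)

print(x_inv, x_solve)
```

Both approaches give x = 1 and y = 3 here. The inverse makes the underlying idea visible, while `np.linalg.solve` is generally the preferred call for numerical work.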
Dot products (20 minutes)
- Presentation: Understanding dot products, orthogonality
- Hands-on exercises: Dot products with NumPy; execute a dot product
Matrix decomposition (20 minutes)
- Presentation: Matrix decomposition, eigenvectors, and eigenvalues
- Hands-on exercises: Matrix decomposition with NumPy; decompose a matrix
Final questions and closing (5 minutes)
Day 2: Calculus and Functions
Number theory (5 minutes)
- Presentation: Number theory; natural numbers, integers, rational, and irrational numbers
- Hands-on exercise: Identify numeric types
Mathematical expressions (5 minutes)
- Presentation: Mathematical expressions; order of operations
Mathematical functions (10 minutes)
- Presentation: Intuition behind mathematical functions
- Discussions: Thinking about infinity; linear and nonlinear functions
- Hands-on exercise: How many possible values are there in this function range?
Exponential functions (10 minutes)
- Presentation: Rules for exponents; rational and irrational exponents
- Hands-on exercise: Simplify the exponential expressions
Logarithmic functions (10 minutes)
- Presentation: Rules for logarithms
- Hands-on exercises: Logarithms in Python; evaluate the logarithmic expressions
Euler’s number and natural logarithms (25 minutes)
- Presentation: Euler’s number e; predicting probability of event over time; natural logarithms
- Hands-on exercises: Continuous compounding of interest; evaluate the expressions
- Break
Derivatives (10 minutes)
- Presentation: What is a derivative?
- Hands-on exercises: Discover slopes on a function; calculate the slopes on a function
Partial derivatives (10 minutes)
- Presentation: Partial derivatives; using tools to calculate partial derivatives
Gradient descent (10 minutes)
- Hands-on exercises: Use gradient descent to minimize a function; find the minimum
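The idea behind gradient descent can be sketched in a few lines: repeatedly step against the slope until the value settles near a minimum. The function name `gradient_descent`, the learning rate, the step count, and the example function f(x) = (x - 3)^2 + 4 are all assumptions made for illustration.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=1000):
    """Follow the negative gradient from `start` for a fixed number of steps."""
    x = start
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x


def grad_f(x):
    """Derivative of the example function f(x) = (x - 3)^2 + 4."""
    return 2 * (x - 3)


x_min = gradient_descent(grad_f, start=0.0)
print(x_min)
```

For this convex example the result converges to the minimum at x = 3. On other functions, the outcome depends on the starting point and the learning rate.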
Calculus integrals (20 minutes)
- Presentation: Calculating area under curves
- Hands-on exercises: Create integral function from scratch; what’s the area under this function?
Final Q&A and closing (5 minutes)
Your Instructor
Thomas Nield
Thomas Nield is the founder of Nield Consulting Group and an instructor at the University of Southern California, where he teaches AI System Safety, developing systematic approaches for identifying AI-related hazards in aviation and ground vehicles. He's authored three books, including Essential Math for Data Science and Getting Started with SQL (both for O'Reilly). He enjoys making technical content relatable and relevant to those unfamiliar or intimidated by it. Thomas teaches classes on data analysis, machine learning, mathematical optimization, and practical artificial intelligence. He’s also the founder and inventor of Yawman Flight, a company that develops universal handheld flight controls for flight simulation and unmanned aerial vehicles.