Essential Math for Data Science in 4 Weeks—with Interactivity
Published by O'Reilly Media, Inc.
Achieve practical math proficiency using Python
With the availability of data, there is a growing demand for talent who can analyze and make sense of it. This makes practical math all the more important because it helps infer insights from data. However, mathematics comprises many topics, and it is hard to identify which ones are applicable and relevant for a data science career. Knowing these essential math topics is key to integrating knowledge across data science, statistics, and machine learning.
With data science expert Thomas Nield, you’ll delve into a carefully curated list of mathematical topics to jumpstart your proficiency in areas of mathematics that you’ll be able to apply immediately. You’ll grasp the fundamentals of probability, statistics, hypothesis testing, linear algebra, and practical calculus. Along the way you’ll integrate what you’ve learned and see practical applications for real-world problems.
Week 1: Probability
Probability is a pillar of data science, machine learning, analytics, and statistics. While measuring random events can seem difficult and abstract, there are practical and intuitive ways to discover probability. You’ll take a tour of foundational concepts in probability, including conditional probability, Bayes' theorem, and distributions. Along the way, you’ll discover these ideas with practical Python code.
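As a taste of the kind of code this session builds toward, here is a minimal sketch of Bayes' theorem applied to a medical-testing scenario. The function name `bayes_posterior` and the input figures (1% prevalence, 95% sensitivity, 90% specificity) are illustrative assumptions, not material taken from the course.

```python
def bayes_posterior(prior, sensitivity, specificity):
    """Return P(condition | positive test) using Bayes' theorem.

    P(A|B) = P(B|A) * P(A) / P(B), where P(B) is expanded with the
    law of total probability over "has condition" and "does not".
    """
    true_positive = sensitivity * prior
    false_positive = (1.0 - specificity) * (1.0 - prior)
    return true_positive / (true_positive + false_positive)


# Hypothetical inputs: 1% prevalence, 95% sensitivity, 90% specificity.
posterior = bayes_posterior(prior=0.01, sensitivity=0.95, specificity=0.90)
print(f"P(condition | positive) = {posterior:.4f}")
```

Under these assumed numbers the posterior stays below 9%, because false positives from the large healthy group outnumber true positives from the small affected group.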
Week 2: Statistics and Hypothesis Testing
In this session you’ll explore descriptive metrics like variance and its nuances, then unravel inferential statistics topics, including hypothesis testing, normal distributions, t-distributions, p-values, z-scores, z-tests, and t-tests. You’ll apply these concepts with Python code.
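To make the z-test and p-value concrete, the following is a minimal sketch of a one-sample, two-sided z-test using only the Python standard library. The helper name `one_sample_z_test` and the sample figures are hypothetical, and the sketch assumes the population standard deviation is known.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, pop_mean, pop_std, n):
    """Return (z, two-sided p-value) for a one-sample z-test.

    Assumes the population standard deviation is known and the sample
    mean is approximately normal (for example, via the central limit theorem).
    """
    standard_error = pop_std / sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    # Two-sided p-value: probability of a result at least this extreme
    # in either tail of the standard normal distribution.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data: 36 dogs averaging 66.0 lb against a claimed mean of
# 64.0 lb, with a known standard deviation of 6.0 lb.
z, p_value = one_sample_z_test(sample_mean=66.0, pop_mean=64.0, pop_std=6.0, n=36)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

With these assumed inputs the z-score is 2.0 and the p-value falls just under 0.05. When the population standard deviation is unknown and the sample is small, a t-test is usually the more appropriate choice.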
Week 3: Linear Algebra
Linear algebra is an enormous and sometimes intimidating topic, but in this session you’ll explore a more visual and intuitive approach to vectors, matrices, and their transformations, which will help you write readable Python code. With this intuitive visual understanding you’ll never look at NumPy the same way again.
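One way to picture a matrix as a transformation is to watch where it sends the basis vectors. The sketch below uses plain Python tuples rather than NumPy so it stays dependency-free; the helper names `transform` and `determinant` and the example matrix are illustrative assumptions.

```python
def transform(matrix, vector):
    """Apply a 2x2 matrix to a 2D vector.

    Each column of the matrix is where a basis vector lands, so the
    result is x * (first column) + y * (second column).
    """
    (a, b), (c, d) = matrix
    x, y = vector
    return (a * x + b * y, c * x + d * y)


def determinant(matrix):
    """Return the factor by which a 2x2 matrix scales areas."""
    (a, b), (c, d) = matrix
    return a * d - b * c


# Hypothetical transformation: stretch x by 2 and shear by 1.
m = ((2, 1),
     (0, 1))

i_hat = transform(m, (1, 0))  # lands on the first column of m
j_hat = transform(m, (0, 1))  # lands on the second column of m
v = transform(m, (3, 2))
print(i_hat, j_hat, v, determinant(m))
```

With NumPy arrays, the same matrix-vector product is typically written with the `@` operator.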
Week 4: Calculus and Functions
In the field of machine learning, a little calculus can go a long way, but you won’t see this method used in academia: you’ll use plain English and avoid expressions crammed with Greek symbols. Starting with an intuitive understanding of numbers, functions, and phenomena, you’ll segue into calculus concepts like derivatives and integrals. And you’ll hack some clever Python code from scratch.
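In the from-scratch spirit described above, a derivative and an integral can each be approximated in a few lines. This is a minimal sketch, assuming a central-difference slope and midpoint rectangles; the function names, step size, and rectangle count are illustrative choices rather than the course's own code.

```python
def derivative(f, x, step=1e-6):
    """Approximate the slope of f at x with a central difference."""
    return (f(x + step) - f(x - step)) / (2.0 * step)


def integral(f, a, b, rectangles=10_000):
    """Approximate the area under f on [a, b] with midpoint rectangles."""
    width = (b - a) / rectangles
    return sum(f(a + (i + 0.5) * width) for i in range(rectangles)) * width


def square(x):
    return x ** 2


slope_at_3 = derivative(square, 3.0)      # exact slope is 2 * 3 = 6
area_0_to_3 = integral(square, 0.0, 3.0)  # exact area is 3**3 / 3 = 9
print(f"{slope_at_3:.4f} {area_0_to_3:.4f}")
```

Shrinking the step or adding rectangles tightens the approximation, which is the same limiting idea that the formal definitions of derivatives and integrals express.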
NOTE: With today’s registration, you’ll be signed up for all 4 sessions. Although you can attend any of the sessions individually, we recommend participating in all 4 weeks and pursuing the skills challenges in between sessions.
Skills challenges
At the end of each week, Thomas Nield will provide you with a skills challenge—an interactive scenario-based evaluation to help you determine whether you’ve mastered the skills taught in the live training and whether you’re ready to apply these skills in a real-world setting.
To reinforce your learning, we strongly recommend pursuing each skills challenge before the next week of the course. If you’re unable to successfully complete the challenge, try reviewing the video recording of the live training (emailed to you 24 hours after each session) for tips.
What you’ll learn and how you can apply it
By the end of this live, hands-on 4-part series, you’ll understand:
Week 1: Probability
- How probability works and what it means to measure randomness
- How multiple events can affect the probability of another event
- How to use discrete and normal distributions
- When to add and multiply probabilities of different events
Week 2: Statistics and Hypothesis Testing
- The relationship between samples and populations
- How to frame experiments and hypothesis testing
- Statistical significance and parameter estimation
Week 3: Linear Algebra
- The intuition behind vectors and matrices, both visually and numerically
- Important vector/matrix operators—what they mean, and their transformations
- Linear systems and matrix decomposition
Week 4: Calculus and Functions
- The nature of functions and how they work, including exponential and logarithmic functions
- The two most fundamental operations in calculus: the derivative and the integral
- What Euler’s number e is and how it’s derived
And you’ll be able to:
Week 1: Probability
- Recognize situations where Bayes' theorem applies
- Leverage Python to create continuous distributions
Week 2: Statistics and Hypothesis Testing
- Leverage statistical significance to test hypotheses
- Quantify uncertainty in experiment results
- Determine confidence intervals and perform A/B testing
Week 3: Linear Algebra
- Appreciate vectors and matrices beyond just a grid of numbers, and visualize operations
- Construct a system of linear equations and solve its variables
- Use Python’s NumPy package to perform linear algebra operations
Week 4: Calculus and Functions
- Calculate the slope or area for any part of a function, using from-scratch code in Python
- Intuitively derive Euler’s number e and other critical functions you’ll encounter in data science
- Apply calculus reasoning to other areas like probability, statistics, and machine learning
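One common way to arrive at Euler's number intuitively is through compound interest: compounding more and more often on $1 at a 100% annual rate approaches e. The sketch below assumes that framing; the function name `compound` and the dollar figures are hypothetical.

```python
import math


def compound(principal, rate, years, periods_per_year):
    """Balance after compounding `periods_per_year` times a year."""
    return principal * (1.0 + rate / periods_per_year) ** (periods_per_year * years)


# With $1 at a 100% rate for 1 year, more frequent compounding approaches e.
for n in (1, 12, 365, 1_000_000):
    print(n, compound(1.0, 1.0, 1, n))

approx_e = compound(1.0, 1.0, 1, 1_000_000)
# Continuous compounding uses the limit directly: P * e**(r * t).
continuous = 100.0 * math.exp(0.05 * 10)
print(approx_e, math.e, continuous)
```

The loop shows the balance climbing from 2.0 toward roughly 2.71828 as the number of compounding periods grows.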
This live event is for you because...
- You’re a budding data science professional who wants to build foundational knowledge in essential math concepts and understand how they apply to probability, statistics, and machine learning.
- You’re a programmer using data science and machine learning libraries and want to understand the math and probability principles behind them.
- You’re managing a data science team and want to have a fundamental understanding of techniques used in the field.
Prerequisites
- A basic understanding of algebra and isolating variables in an equation
- Basic Python proficiency to follow code examples
Recommended preparation
- A computer with a Python 3 environment set up for running code examples, or the O'Reilly interactive Python sandbox in a browser
Recommended follow-up
- Read the first 4 chapters of Essential Math for Data Science by Thomas Nield
- Take Machine Learning from Scratch (live online training course with Thomas Nield)
- Read Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, second edition (book)
Schedule
The time frames are only estimates and may vary according to how the class is progressing.
Week 1: Probability
Introduction and getting started (10 minutes)
- Discussion: Monty Hall Problem
- Presentation: What is probability?
Understanding probability (10 minutes)
- Presentation: Frequentist and Bayesian probability; odds ratios
- Hands-on exercise: Want to make a bet?
Adding and multiplying probabilities (15 minutes)
- Presentation: Joint probability; union probability
- Hands-on exercises: Rain and joint probability; rain and union probability
Conditional probability (20 minutes)
- Presentation: Conditional probability and colorblindness
- Hands-on exercise: Rain and conditional probability
- Break
Bayes' theorem (20 minutes)
- Presentation: Violence and video games
- Hands-on exercise: Medical testing accuracy
Binomial distribution (10 minutes)
- Presentation: Binomial distribution
- Hands-on exercise: Airline empty seats
Normal distribution (15 minutes)
- Presentation: Normal distribution; quantile functions
- Hands-on exercise: Predict life expectancy of a phone
Beta distribution (15 minutes)
- Presentation: Beta distribution and probability of probabilities
- Hands-on exercise: Unfair coin flip
Other distributions and closing (5 minutes)
- Presentation: Other distributions (Poisson, exponential)
- Closing and final Q&A
Week 2: Statistics and Hypothesis Testing
Getting started (5 minutes)
- Presentation: What to expect; why learn statistics and hypothesis testing?
The basics (10 minutes)
- Presentation: Mean and standard deviation; the normal distribution
Central limit theorem (10 minutes)
- Presentation: Discovering the central limit theorem
- Hands-on exercises: Average die rolls; average samples from a uniform distribution
Population and sample sizes (15 minutes)
- Presentation: Populations versus samples; standard deviation of the mean
- Hands-on exercises: Sample golden retriever weights; how much is enough?
P-values and z-tests (20 minutes)
- Presentation: Discovering the p-value; standard normal distribution and the z-test
- Hands-on exercise: The tea party problem
- Discussion: What could go wrong using a p-value?
- Break
Confidence intervals (10 minutes)
- Presentation: What is a confidence interval?
- Activity: Calculate a confidence interval for the mean unionized salary
T-distribution (10 minutes)
- Presentation: T-distribution versus the normal distribution
- Hands-on exercise: Sample size effect on t-distribution
T-tests (15 minutes)
- Presentation: Discovering the t-test
- Hands-on exercises: Is my golden retriever underweight? Is my golden retriever overweight?
Paired and 2-sample t-tests (15 minutes)
- Presentation: Extending the t-test
- Hands-on exercises: Does this diet work? Did a pricing change increase sales?
Q&A and closing (10 minutes)
Week 3: Linear Algebra
Getting started (5 minutes)
- Presentation: What is linear algebra? Why learn linear algebra?
Vectors, combining, and scaling (25 minutes)
- Presentation: What are vectors? Combining and scaling vectors; span and linear dependence
- Hands-on exercises: Add and scale vectors in NumPy; add and scale vectors
Transforming vectors and matrices (30 minutes)
- Presentation: Basis vectors, matrices, the determinant
- Hands-on exercises: Matrices and the determinant in NumPy; transform a vector
System of linear equations and inverse matrices (15 minutes)
- Presentation: Solving systems of linear equations with inverse matrices
- Hands-on exercises: Solving systems of linear equations with NumPy; a word problem
Dot products (20 minutes)
- Presentation: Understanding dot products, orthogonality
- Hands-on exercises: Dot products with NumPy; execute a dot product
Matrix decomposition (20 minutes)
- Presentation: Matrix decomposition, eigenvectors, and eigenvalues
- Hands-on exercises: Matrix decomposition with NumPy; decompose a matrix
Final questions and closing (5 minutes)
Week 4: Calculus and Functions
Number theory (5 minutes)
- Presentation: Number theory; natural numbers, integers, rational, and irrational numbers
- Hands-on exercise: Identify numeric types
Mathematical expressions (5 minutes)
- Presentation: Mathematical expressions; order of operations
Mathematical functions (10 minutes)
- Presentation: Intuition behind mathematical functions
- Discussions: Thinking about infinity; linear and nonlinear functions
- Hands-on exercise: How many possible values are there in this function range?
Exponential functions (10 minutes)
- Presentation: Rules for exponents; rational and irrational exponents
- Hands-on exercise: Simplify the exponential expressions
Logarithmic functions (10 minutes)
- Presentation: Rules for logarithms
- Hands-on exercises: Logarithms in Python; evaluate the logarithmic expressions
Euler’s number and natural logarithms (25 minutes)
- Presentation: Euler’s number e; predicting probability of event over time; natural logarithms
- Hands-on exercises: Continuous compounding of interest; evaluate the expressions
- Break
Derivatives (10 minutes)
- Presentation: What is a derivative?
- Hands-on exercises: Discover slopes on a function; calculate the slopes on a function
Partial derivatives (10 minutes)
- Presentation: Partial derivatives; using tools to calculate partial derivatives
Gradient descent (10 minutes)
- Hands-on exercises: Use gradient descent to minimize a function; find the minimum
Calculus integrals (20 minutes)
- Presentation: Calculating area under curves
- Hands-on exercises: Create an integral function from scratch; what’s the area under this function?
Final Q&A and closing (5 minutes)
Your Instructor
Thomas Nield
Thomas Nield is the founder of Nield Consulting Group and an instructor at the University of Southern California, where he teaches AI System Safety, developing systematic approaches for identifying AI-related hazards in aviation and ground vehicles. He’s authored three books, including Essential Math for Data Science and Getting Started with SQL (both for O'Reilly). He enjoys making technical content relatable and relevant to those unfamiliar or intimidated by it. Thomas teaches classes on data analysis, machine learning, mathematical optimization, and practical artificial intelligence. He’s also the founder and inventor of Yawman Flight, a company that develops universal handheld flight controls for flight simulation and unmanned aerial vehicles.