on-demand course

Machine Learning with Python for Everyone, Part 2: Measuring Models

with Mark Fenner

August 2022

Beginner to intermediate

5h 34m

English

Pearson

Closed Captioning available in English

Course outline

Machine Learning with Python for Everyone: Introduction
4m 2s
Topics
32s
1.1 Error, Cost, and Complexity
9m 23s
1.2 Overfitting/Underfitting I: Synthetic Data
6m 42s
1.3 Overfitting/Underfitting II: Varying Model Complexity
14m 40s
1.4 Errors and Costs
4m 49s
1.5 Resampling Techniques
13m 34s
1.6 Cross-Validation
5m 45s
1.7 Leave-One-Out Cross-Validation
4m 20s
1.8 Stratification
5m 59s
1.9 Repeated Train-Test Splits
7m 6s
1.10 Graphical Techniques
10m 8s
1.11 Getting Graphical: Learning and Complexity Curves
11m 46s
1.12 Graphical Cross-Validation
4m 29s
Topics
33s
2.1 Classification Metrics
12m 9s
2.2 Baseline Classifiers and Metrics
8m 28s
2.3 The Confusion Matrix
8m 19s
2.4 Metrics from the Binary Confusion Matrix
18m 17s
2.5 Performance Curves
20m 3s
2.6 Understanding the ROC Curve and AUC
12m 59s
2.7 Comparing Classifiers with ROC and PR Curves
8m 4s
Topics
37s
3.1 Multi-Class Issues
9m 36s
3.2 Multi-class Metric Averages
13m 0s
3.3 Multi-class AUC: One-versus-Rest
6m 26s
3.4 Multi-class AUC: The Hand and Till Method
5m 49s
3.5 More Curves
8m 48s
3.6 Cumulative Response and Lift Curves
5m 49s
3.7 Case Study: A Classifier Comparison
9m 37s
Topics
33s
4.1 Regression Metrics
8m 17s
4.2 Baseline Regressors
6m 18s
4.3 Regression Metrics: Custom Metrics and RMSE
7m 40s
4.4 Understanding the Default Regression Metric R^2
14m 53s
4.5 Errors and Residual Plots
11m 18s
4.6 Standardization
9m 24s
4.7 A Quick Pipeline and Standardization
10m 12s
4.8 Case Study: A Regressor Comparison
12m 44s
Machine Learning with Python for Everyone: Summary
1m 14s

Overview

4 Hours of Video Instruction

Description

Code-along sessions move you from introductory machine learning concepts to concrete code.

Overview

Machine learning is moving from futuristic AI projects to data analysis on your desk. You need to go beyond following along in discussions to coding machine learning tasks. These videos avoid heavy mathematics to focus on how to turn introductory machine learning concepts into concrete code using Python, scikit-learn, and friends.

You will learn about the fundamental metrics used to evaluate general learning systems and specific metrics used in classification and regression. You will learn techniques for getting the most informative learning performance measures out of your data. You will come away with a strong toolbox of numerical and graphical techniques to understand how your learning system will perform on novel data.

About the Instructor

Mark Fenner, PhD, has been teaching computing and mathematics to diverse adult audiences since 1999. His research projects have addressed design, implementation, and performance of machine learning and numerical algorithms, learning systems for security analysis of software repositories and intrusion detection, probabilistic models of protein function, and analysis and visualization of ecological and microscopy data. Mark continues to work across the data science spectrum from C, Fortran, and Python implementation to statistical analysis and visualization. He has delivered training and developed curriculum for Fortune 50 companies, boutique consultancies, and national-level research laboratories. Mark holds a Ph.D. in Computer Science and owns Fenner Training and Consulting, LLC.

Skill Level

Beginner to Intermediate

Learn How To

Recognize underfitting and overfitting with graphical plots
Make use of resampling techniques like cross-validation to get the most out of your data
Graphically evaluate the learning performance of learning systems
Compare production learners with baseline models over various classification metrics
Build and evaluate confusion matrices and ROC curves
Apply classification metrics to multi-class learning problems
Develop precision-recall and lift curves for classifiers
Compare production regression techniques with baseline regressors over various regression metrics
Construct residual plots for regressors

Who Should Take This Course

This course is a good fit for anyone who needs to improve their fundamental understanding of machine learning concepts and become familiar with basic machine learning code. You might be a newer data scientist, a data analyst transitioning to the use of machine learning models, a research and development scientist looking to add machine learning techniques to your classical statistical training, or a manager adding data science/machine learning capabilities to your team.

Course Requirements

Students should have a basic understanding of programming in Python (variables, basic control flow, simple scripts). They should also be familiar with the vocabulary of machine learning (dataset, training set, test set, model), though their knowledge of the concepts can be very shallow. They also need a working Python installation that allows them to use scikit-learn and matplotlib.

Lesson Descriptions

Lesson 1: Evaluating Learning Performance

Lesson 1 covers fundamental issues with learning systems and techniques to assess them. It starts with a discussion of error, cost, and complexity. Then you learn about overfitting and underfitting: these happen when our model, data, and noise in the system interact with each other poorly. To identify these scenarios, we need to make clever use, and even reuse, of our data. We also look at general techniques to graphically view the performance of our model(s) and how they interact with the data.
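
As a rough illustration of reusing data to spot underfitting and overfitting, the sketch below scores polynomial models of increasing complexity with 5-fold cross-validation. The synthetic quadratic dataset, the noise level, and the chosen degrees (1, 2, 10) are assumptions made for this example and are not taken from the course materials.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data (an assumption for this sketch): a noisy quadratic.
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 60)
y = x**2 + rng.normal(scale=1.0, size=x.size)
X = x.reshape(-1, 1)

# Shuffled 5-fold cross-validation, so every point is used for testing once.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

cv_mse = {}
for degree in (1, 2, 10):
    # Model complexity is controlled by the polynomial degree.
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    scores = cross_val_score(model, X, y, cv=cv,
                             scoring="neg_mean_squared_error")
    cv_mse[degree] = -scores.mean()  # flip the sign back to a plain MSE
    print(f"degree={degree:2d}  cross-validated MSE={cv_mse[degree]:.2f}")
```

On data like this, the degree-1 model underfits and shows a much higher cross-validated error than the degree-2 model, which matches the underlying curve. Comparing the highest degree against degree 2 hints at whether the extra complexity is paying off.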

Lesson 2: Evaluating Classifiers (Part 1)

Lessons 2 and 3 are about specific issues in evaluating classification systems. Lesson 2 begins with a general discussion of classification metrics and then turns to baseline classifiers and metrics. Then the focus is on the confusion matrix and metrics derived from it. The confusion matrix lays out the ways we are right and the ways we are wrong on an outcome-by-outcome basis. Here we focus on the case where we have two outcomes of interest.
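
To make the two-outcome confusion matrix concrete, here is a minimal sketch using scikit-learn. The ten hand-written labels and predictions are invented for illustration, and the "most frequent class" baseline is one common choice of baseline classifier, not necessarily the one used in the course.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score)

# Hypothetical labels: 1 is the outcome of interest, 0 is everything else.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

# For binary labels, scikit-learn lays the matrix out as [[TN, FP], [FN, TP]].
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")

# Metrics derived from the four cells.
precision = precision_score(y_true, y_pred)  # TP / (TP + FP)
recall = recall_score(y_true, y_pred)        # TP / (TP + FN)
accuracy = accuracy_score(y_true, y_pred)    # (TP + TN) / total
print(f"precision={precision:.2f} recall={recall:.2f} accuracy={accuracy:.2f}")

# Baseline: always predict the most frequent class, ignoring the features.
X_dummy = np.zeros((len(y_true), 1))
baseline = DummyClassifier(strategy="most_frequent").fit(X_dummy, y_true)
baseline_pred = baseline.predict(X_dummy)
baseline_acc = accuracy_score(y_true, baseline_pred)
baseline_recall = recall_score(y_true, baseline_pred)
print(f"baseline accuracy={baseline_acc:.2f} recall={baseline_recall:.2f}")
```

The baseline reaches 60% accuracy without ever finding a positive case, which is why accuracy alone can mislead and why comparing against a baseline across several metrics is useful.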

Lesson 3: Evaluating Classifiers (Part 2)

Lesson 3 extends the discussion to include cases where we have more than two outcomes of interest. Several approaches to multi-class evaluation are discussed, as well as some classification-specific graphical techniques: cumulative response and lift curves. The lesson ends with a case study comparison of classifiers.
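
One way to see why multi-class evaluation needs care is to compare averaging strategies for a single metric. The sketch below computes per-class, macro-averaged, and micro-averaged recall with scikit-learn. The three-class labels are made up for this example and are deliberately imbalanced.

```python
from sklearn.metrics import recall_score

# Hypothetical three-class problem: class 0 is common, class 2 is rare.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 2]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 0, 0]

# One recall value per class, in label order (0, 1, 2).
per_class = recall_score(y_true, y_pred, average=None)

# Macro: unweighted mean of the per-class recalls; every class counts equally.
macro = recall_score(y_true, y_pred, average="macro")

# Micro: pool all examples first; frequent classes dominate the result.
micro = recall_score(y_true, y_pred, average="micro")

print("per-class recall:", per_class.round(3))
print(f"macro recall={macro:.2f}  micro recall={micro:.2f}")
```

Here the classifier never identifies the rare class. The macro average (0.50) exposes that failure, while the micro average (0.70) largely hides it.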

Lesson 4: Evaluating Regressors

Lesson 4 discusses techniques specific to evaluating regressors. The lesson begins with regression metrics and baseline regressors. It then covers how to develop custom, user-defined metrics. Next up are graphical evaluation techniques, followed by a quick look at pipelines and standardization. The lesson concludes with a case study comparing several different regression systems.
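
The pieces named in this lesson can be combined in a short sketch: a user-defined RMSE metric, a mean-predicting baseline regressor, and a pipeline that standardizes features before fitting. The synthetic data, the helper name `rmse`, and the choice of linear regression are assumptions for illustration only.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def rmse(y_true, y_pred):
    """Custom metric: root mean squared error, in the units of the target."""
    diff = np.asarray(y_true) - np.asarray(y_pred)
    return float(np.sqrt(np.mean(diff ** 2)))


# Synthetic data (an assumption): two features on very different scales.
rng = np.random.default_rng(42)
X = np.column_stack([rng.normal(0, 1, 300), rng.normal(0, 1000, 300)])
y = 3.0 * X[:, 0] + 0.002 * X[:, 1] + rng.normal(0, 0.5, 300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Baseline: always predict the mean of the training targets.
baseline = DummyRegressor(strategy="mean").fit(X_train, y_train)

# Pipeline: the scaler is fit on training data only, then reused at predict time.
model = make_pipeline(StandardScaler(), LinearRegression()).fit(X_train, y_train)

baseline_rmse = rmse(y_test, baseline.predict(X_test))
model_rmse = rmse(y_test, model.predict(X_test))
residuals = y_test - model.predict(X_test)  # the raw material for a residual plot

print(f"baseline RMSE={baseline_rmse:.2f}  model RMSE={model_rmse:.2f}")
```

Placing the scaler inside the pipeline keeps test data from leaking into the standardization step. Standardization does not change the predictions of plain linear regression, but it matters for scale-sensitive learners such as nearest neighbors or regularized models.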

About Pearson Video Training

Pearson publishes expert-led video tutorials covering a wide selection of technology topics designed to teach you the skills you need to succeed. These professional and personal technology videos feature world-leading author instructors published by your trusted technology brands: Addison-Wesley, Cisco Press, Pearson IT Certification, Prentice Hall, Sams, and Que. Topics include IT Certification, Network Security, Cisco Technology, Programming, Web Development, Mobile Development, and more. Learn more about Pearson Video training at http://www.informit.com/video.

Publisher Resources

ISBN: 9780136932604
