Open Source MLOps in 4 Weeks
Published by O'Reilly Media, Inc.
Building end-to-end ML applications
In this course you’ll:
- Identify steps to launch new machine learning projects
- Apply data management and versioning techniques and tools
- Recognize the significance of ML pipelines and learn how to build one
Training a well-performing machine learning model is rarely the hardest part of an ML project's lifecycle; the hardest part is incorporating that model into a production-ready application that adds value for customers and the business. Data management and versioning are common challenges, as are building reproducible ML pipelines, configuring CI/CD workflows, and monitoring the deployed model in the production environment.
Join expert Alex Kim to get a handle on these challenges by building an end-to-end machine learning application, starting with a proof of concept in a Jupyter notebook and progressing through a robust ML pipeline and, eventually, a web application deployed to a cloud platform. You’ll learn about CI/CD tools and processes and how they can be applied to ML projects and understand how to maintain an ML project in production and implement data drift monitoring. Along the way, you’ll dip into a toolbox that includes Git, DVC, CML, Docker, AWS, GitHub Actions, FastAPI, Heroku, and more.
NOTE: With today’s registration, you’ll be signed up for all four weeks. Although you can attend any of the sessions individually, we recommend participating in all four weeks.
What you’ll learn and how you can apply it
Week 1: Kick-Starting an ML Project
- Apply best practices for establishing ML project structure and dependency management
- Manage project dependencies with pip and virtualenv
- Version datasets with DVC
Week 2: ML Pipelines and Reproducibility
- Refactor a Jupyter notebook into a reproducible ML pipeline
- Version artifacts of an ML pipeline in remote storage
- Iterate over a large number of ML experiments in a disciplined way
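As a taste of what reproducibility means in practice, here is a minimal sketch of the idea behind hash-based pipeline stages: a stage reruns only when the content of one of its dependencies has changed. This is an illustration written with the Python standard library, not DVC's implementation; `run_stage`, `file_digest`, and the JSON lock file are hypothetical names chosen for this example.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def run_stage(name, deps, command, lock_path):
    """Run `command` only if a dependency changed since the last recorded run.

    Returns True if the stage ran and False if it was skipped.
    The lock file format here is an assumption made for this sketch.
    """
    lock_path = Path(lock_path)
    lock = json.loads(lock_path.read_text()) if lock_path.exists() else {}

    # Fingerprint every dependency by content, not by timestamp.
    current = {str(dep): file_digest(dep) for dep in deps}

    if lock.get(name) == current:
        return False  # inputs unchanged, so earlier outputs are still valid

    command()
    lock[name] = current
    lock_path.write_text(json.dumps(lock, indent=2, sort_keys=True))
    return True
```

Calling `run_stage` twice in a row on unchanged inputs runs the command once and skips it the second time; editing any dependency file triggers a rerun.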
Week 3: CI/CD for ML and ML-Based Web API
- Learn the basics of CI/CD
- Leverage the power of CI/CD tools for ML projects with CML
- Integrate an ML model into the FastAPI framework
- Build and test a Docker container running a web API service
- Deploy the resulting Docker container to the cloud
Week 4: Data Drift Monitoring for ML Projects
- Distinguish between application monitoring and ML monitoring
- Use the Alibi Detect framework to detect data drift
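To make "data drift" concrete, the sketch below compares a reference sample of one feature against a newer sample using a two-sample Kolmogorov-Smirnov statistic, the largest gap between the two empirical distribution functions. It is a hand-rolled, standard-library illustration and does not use Alibi Detect's API; the function names and the `threshold` default are assumptions made for this example, not recommended values.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute difference between the empirical CDFs
    of the two samples, a value between 0.0 and 1.0.
    """
    ref = sorted(reference)
    cur = sorted(current)
    if not ref or not cur:
        raise ValueError("both samples must be non-empty")

    gap = 0.0
    # The empirical CDFs only change at observed values, so checking
    # every observed value is enough to find the maximum gap.
    for x in set(ref) | set(cur):
        ref_cdf = bisect_right(ref, x) / len(ref)
        cur_cdf = bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(ref_cdf - cur_cdf))
    return gap


def has_drifted(reference, current, threshold=0.2):
    """Flag drift when the KS statistic exceeds a hypothetical threshold."""
    return ks_statistic(reference, current) > threshold
```

For identical samples the statistic is 0.0, and for samples with no overlap in range, such as `[1, 2, 3]` versus `[10, 11, 12]`, it is 1.0. A production detector would typically turn the statistic into a p-value and handle many features at once.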
This live event is for you because...
- You’re a recent STEM or coding bootcamp graduate, a practicing data scientist, or a software engineer who is involved in bringing ML models to production.
- You work with Python, Linux, and cloud technologies.
- You want to become an ML/MLOps engineer or a more versatile software engineer.
Prerequisites
- A computer, plus GitHub, AWS, and Gitpod.io accounts set up in advance
- Familiarity with Python, Linux, and Docker
- An understanding of basic ML concepts (types of ML problems, ML model evaluation)
Recommended preparation:
- Explore Introduction to Python: Learn How to Program Today with Python (video course)
- Explore Essential Machine Learning and AI with Python and Jupyter Notebook (video course)
- Read Introducing MLOps (book)
- Take Introduction to Docker and Containers (live online course with Noureddin Sadawi)
- Take Hands-On with AWS S3 (live online course with Rick Crisci)
Recommended follow-up:
- Take GitHub Actions and GitOps in One Hour (video course)
- Read Designing Machine Learning Systems (book)
- Watch “Model Monitoring Pipelines” (conference talk)
- Read Building Machine Learning Powered Applications (book)
- Read Machine Learning Design Patterns (book)
Schedule
The time frames are only estimates and may vary according to how the class is progressing.
Week 1: Kick-Starting an ML Project (120 minutes)
- Group discussion: Participants’ experience with ML and MLOps
- Presentation: The ML project lifecycle and MLOps best practices
- Q&A
- Break
- Hands-on exercises: Set up project; explore data versioning with DVC
- Q&A
Week 2: ML Pipelines and Reproducibility (120 minutes)
- Presentation: ML pipelines
- Q&A
- Break
- Hands-on exercises: Build and run an ML pipeline; explore ML experiment management
- Q&A
Week 3: CI/CD for ML and ML-Based Web API (120 minutes)
- Presentation: CI/CD and CML
- Hands-on exercises: Configure GitHub Actions; create an ML-focused CI/CD workflow; integrate an ML model into a web API service
- Q&A
- Break
- Hands-on exercises: Create a Docker container for the web API service; deploy to Fly.io
Week 4: Data Drift Monitoring for ML Projects (120 minutes)
- Presentation: ML monitoring
- Hands-on exercises: Introduction to Alibi Detect
- Q&A
- Break
- Hands-on exercises: Incorporate Alibi Detect into the DVC pipeline; update the web app and CI/CD workflows to store data drift metrics
- Q&A
Your Instructor
Alex Kim
Alex Kim is an independent consultant who works on data science and machine learning problems. He also consults with EdTech companies and universities to help ensure that their DS and ML curricula stay up to date with job-market demands and deliver the best learning experience to students. His background is in physics and software engineering. More about his background and professional interests: https://alex000kim.com/about