Live Online Training

MLflow First Steps

A platform for managing a complete machine learning lifecycle

Topic: Data
Jules Damji

Machine learning (ML) development brings many new complexities beyond the traditional software development lifecycle. ML developers need to try multiple algorithms, tools, and parameters to get the best results, and they need to track this information to reproduce work. In addition, developers need to use many distinct systems to productionize models.

The open source MLflow platform simplifies the entire ML lifecycle and solves many of these problems. MLflow introduces simple abstractions to package reproducible projects, track results, and encapsulate models that can be used with many existing tools. It also includes a central repository for sharing models, which accelerates the ML lifecycle for organizations of any size.

Join expert Jules Damji to get started with MLflow. You'll explore MLflow’s four main components—MLflow Tracking, MLflow Projects, MLflow Models, and Model Registry—and discover how each helps address challenges of the ML lifecycle. Through a series of hands-on exercises, you’ll learn how to use MLflow in your ML development lifecycle, navigate your way around MLflow documentation and code examples, convert your existing machine learning code to use MLflow APIs, and more.

What you'll learn and how you can apply it

By the end of this live online course, you’ll understand:

  • How to use MLflow Tracking to record and query experiments
  • How to use the MLflow Projects packaging format to reproduce runs
  • How to use the MLflow Models general format to send models to diverse deployment tools
  • How to use Model Registry for collaborative model lifecycle management
  • How to use the MLflow UI to visually compare and contrast experimental runs with different tuning parameters and evaluate metrics

And you’ll be able to:

  • Use Databricks Community Edition (DCE)
  • Navigate MLflow Python APIs documentation
  • Visualize and compare experiment metrics, parameters, and runs
  • Build and execute MLflow Projects
  • Understand Model Registry workflows (the Model Registry UI and the Model Registry API)
  • Deploy and serve a model from the Model Registry on the local host

This training course is for you because...

  • You’re a data scientist or machine learning developer.
  • You work with Python and common machine learning frameworks such as Keras, TensorFlow, scikit-learn, or Spark ML.
  • You want to become more productive in managing your ML models and experiments.

Prerequisites

  • A local machine (preferably Unix-based, with 8+ GB of memory) that has Chrome or Firefox; PyCharm/IntelliJ or another syntax-aware Python editor of your choice; pip/pip3 or conda; Python 3; and the course notebooks imported and set up
  • A Databricks account (required to take part in hands-on exercises)
  • A GitHub account
  • A working knowledge of Python 3 and conda or pip/pip3
  • Familiarity with machine learning concepts, libraries, and frameworks

About your instructor

  • Jules S. Damji is a Senior Developer Advocate at Databricks, an MLflow contributor, and a coauthor of Learning Spark, 2nd Edition (O’Reilly). He is a hands-on developer with over 20 years of experience and has built large-scale distributed systems at leading companies such as Sun Microsystems, Netscape, @Home, Opsware/Loudcloud, VeriSign, ProQuest, and Hortonworks. He holds a B.Sc. and an M.Sc. in computer science (from Oregon State University and Cal State, Chico, respectively) and an MA in political advocacy and communication (from Johns Hopkins University).

Schedule

The timeframes are only estimates and may vary according to how the class is progressing.

Introduction to MLflow: Concepts and MLflow Tracking (55 minutes)

  • Presentation and hands-on exercises: What MLflow is and why to use it; how MLflow addresses the ML lifecycle; MLflow APIs docs; using Databricks Community Edition (DCE); MLflow Python Fluent Tracking APIs; walk-through of two machine learning models using MLflow APIs in DCE; using the MLflow UI as part of DCE to compare experiment metrics, parameters, and runs

Break (5 minutes)

Introduction to MLflow: MLflow Projects (55 minutes)

  • Presentation and hands-on exercises: Concepts and motivation behind MLflow Projects; MLflow Project API documentation; executing and reproducing MLflow Projects in DCE; building an MLflow Project and sharing it for reproducible runs; using the MLflow UI as part of DCE to compare experiment metrics, parameters, and runs

Break (5 minutes)

Introduction to MLflow: MLflow Models (55 minutes)

  • Presentation and hands-on exercises: Concepts and motivation behind MLflow Models; MLflow Model API documentation; creating different Model flavors; Pyfunc Model Flavor—what it is and how to use it; using the MLflow UI on DCE

Break (5 minutes)

Introduction to MLflow: Model Registry (55 minutes)

  • Presentation and hands-on exercises: Concepts and motivation behind the MLflow Model Registry; Model Registry API documentation; Model Registry Workflow—UI and API; creating models and registering them; using Pyfunc Model Flavor to load models from Model Registry; using the Model Registry UI on Jupyter Lab (local host) or Google Colab

Wrap-up and Q&A (5 minutes)