Kubeflow for Machine Learning

Book description

If you're training a machine learning model but aren't sure how to put it into production, this book will get you there. Kubeflow provides a collection of cloud-native tools for different stages of a model's lifecycle, from data exploration, feature preparation, and model training to model serving. This guide helps data scientists build production-grade machine learning implementations with Kubeflow and shows data engineers how to make models scalable and reliable.

Using examples throughout the book, authors Holden Karau, Trevor Grant, Ilan Filonenko, Richard Liu, and Boris Lublinsky explain how to use Kubeflow to train and serve your machine learning models on top of Kubernetes in the cloud or in a development environment on-premises.

  • Understand Kubeflow's design, core components, and the problems it solves
  • Understand how Kubeflow differs across cluster types
  • Train models using Kubeflow with popular tools including scikit-learn, TensorFlow, and Apache Spark
  • Keep your model up to date with Kubeflow Pipelines
  • Understand how to capture model training metadata
  • Explore how to extend Kubeflow with additional open source tools
  • Use hyperparameter tuning for training
  • Learn how to serve your model in production

Table of contents

  1. Foreword
  2. Preface
    1. Our Assumption About You
    2. Your Responsibility as a Practitioner
    3. Conventions Used in This Book
    4. Code Examples
      1. Using Code Examples
    5. O’Reilly Online Learning
    6. How to Contact the Authors
    7. How to Contact Us
    8. Acknowledgments
    9. Grievances
  3. 1. Kubeflow: What It Is and Who It Is For
    1. Model Development Life Cycle
    2. Where Does Kubeflow Fit In?
    3. Why Containerize?
    4. Why Kubernetes?
    5. Kubeflow’s Design and Core Components
      1. Data Exploration with Notebooks
      2. Data/Feature Preparation
      3. Training
      4. Hyperparameter Tuning
      5. Model Validation
      6. Inference/Prediction
      7. Pipelines
      8. Component Overview
    6. Alternatives to Kubeflow
      1. Clipper (RiseLabs)
      2. MLflow (Databricks)
      3. Others
    7. Introducing Our Case Studies
      1. Modified National Institute of Standards and Technology
      2. Mailing List Data
      3. Product Recommender
      4. CT Scans
    8. Conclusion
  4. 2. Hello Kubeflow
    1. Getting Set Up with Kubeflow
      1. Installing Kubeflow and Its Dependencies
      2. Setting Up Local Kubernetes
      3. Setting Up Your Kubeflow Development Environment
      4. Creating Our First Kubeflow Project
    2. Training and Deploying a Model
      1. Training and Monitoring Progress
      2. Test Query
    3. Going Beyond a Local Deployment
    4. Conclusion
  5. 3. Kubeflow Design: Beyond the Basics
    1. Getting Around the Central Dashboard
      1. Notebooks (JupyterHub)
      2. Training Operators
      3. Kubeflow Pipelines
      4. Hyperparameter Tuning
      5. Model Inference
      6. Metadata
      7. Component Summary
    2. Support Components
      1. MinIO
      2. Istio
      3. Knative
      4. Apache Spark
      5. Kubeflow Multiuser Isolation
    3. Conclusion
  6. 4. Kubeflow Pipelines
    1. Getting Started with Pipelines
      1. Exploring the Prepackaged Sample Pipelines
      2. Building a Simple Pipeline in Python
      3. Storing Data Between Steps
    2. Introduction to Kubeflow Pipelines Components
      1. Argo: the Foundation of Pipelines
      2. What Kubeflow Pipelines Adds to Argo Workflow
      3. Building a Pipeline Using Existing Images
      4. Kubeflow Pipeline Components
    3. Advanced Topics in Pipelines
      1. Conditional Execution of Pipeline Stages
      2. Running Pipelines on Schedule
    4. Conclusion
  7. 5. Data and Feature Preparation
    1. Deciding on the Correct Tooling
    2. Local Data and Feature Preparation
      1. Fetching the Data
      2. Data Cleaning: Filtering Out the Junk
      3. Formatting the Data
      4. Feature Preparation
      5. Custom Containers
    3. Distributed Tooling
      1. TensorFlow Extended
      2. Distributed Data Using Apache Spark
      3. Distributed Feature Preparation Using Apache Spark
    4. Putting It Together in a Pipeline
    5. Using an Entire Notebook as a Data Preparation Pipeline Stage
    6. Conclusion
  8. 6. Artifact and Metadata Store
    1. Kubeflow ML Metadata
      1. Programmatic Query
      2. Kubeflow Metadata UI
    2. Using MLflow’s Metadata Tools with Kubeflow
      1. Creating and Deploying an MLflow Tracking Server
      2. Logging Data on Runs
      3. Using the MLflow UI
    3. Conclusion
  9. 7. Training a Machine Learning Model
    1. Building a Recommender with TensorFlow
      1. Getting Started
      2. Starting a New Notebook Session
      3. TensorFlow Training
    2. Deploying a TensorFlow Training Job
    3. Distributed Training
      1. Using GPUs
      2. Using Other Frameworks for Distributed Training
    4. Training a Model Using Scikit-Learn
      1. Starting a New Notebook Session
      2. Data Preparation
      3. Scikit-Learn Training
      4. Explaining the Model
      5. Exporting Model
      6. Integration into Pipelines
    5. Conclusion
  10. 8. Model Inference
    1. Model Serving
      1. Model Serving Requirements
    2. Model Monitoring
      1. Model Accuracy, Drift, and Explainability
      2. Model Monitoring Requirements
    3. Model Updating
      1. Model Updating Requirements
    4. Summary of Inference Requirements
    5. Model Inference in Kubeflow
    6. TensorFlow Serving
      1. Review
    7. Seldon Core
      1. Designing a Seldon Inference Graph
      2. Testing Your Model
      3. Serving Requests
      4. Monitoring Your Models
      5. Review
    8. KFServing
      1. Serverless and the Service Plane
      2. Data Plane
      3. Example Walkthrough
      4. Peeling Back the Underlying Infrastructure
      5. Review
    9. Conclusion
  11. 9. Case Study Using Multiple Tools
    1. The Denoising CT Scans Example
      1. Data Prep with Python
      2. DS-SVD with Apache Spark
      3. Visualization
      4. The CT Scan Denoising Pipeline
    2. Sharing the Pipeline
    3. Conclusion
  12. 10. Hyperparameter Tuning and Automated Machine Learning
    1. AutoML: An Overview
    2. Hyperparameter Tuning with Kubeflow Katib
    3. Katib Concepts
    4. Installing Katib
    5. Running Your First Katib Experiment
      1. Prepping Your Training Code
      2. Configuring an Experiment
      3. Running the Experiment
      4. Katib User Interface
    6. Tuning Distributed Training Jobs
    7. Neural Architecture Search
    8. Advantages of Katib over Other Frameworks
    9. Conclusion
  13. A. Argo Executor Configurations and Trade-Offs
  14. B. Cloud-Specific Tools and Configuration
    1. Google Cloud
      1. TPU-Accelerated Instances
      2. Dataflow for TFX
  15. C. Using Model Serving in Applications
    1. Building Streaming Applications Leveraging Model Serving
      1. Stream Processing Engines and Libraries
      2. Introducing Cloudflow
    2. Building Batch Applications Leveraging Model Serving
  16. Index

Product information

  • Title: Kubeflow for Machine Learning
  • Author(s): Trevor Grant, Holden Karau, Boris Lublinsky, Richard Liu, Ilan Filonenko
  • Release date: October 2020
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781492050124