Learning Ray

Book description

Get started with Ray, the open source distributed computing framework that simplifies the process of scaling compute-intensive Python workloads. With this practical book, Python programmers, data engineers, and data scientists will learn how to leverage Ray locally and spin up compute clusters. You'll be able to use Ray to structure and run machine learning programs at scale.

Authors Max Pumperla, Edward Oakes, and Richard Liaw show you how to build machine learning applications with Ray. You'll see how Ray fits into the current landscape of machine learning tools and how it integrates ever more tightly with them. Distributed computing is hard, but with Ray you'll find it easy to get started.

  • Learn how to build your first distributed application with Ray Core
  • Conduct hyperparameter optimization with Ray Tune
  • Use the Ray RLlib library for reinforcement learning
  • Manage distributed training with the Ray Train library
  • Perform data processing with Ray Datasets
  • Learn how to work with Ray Clusters and serve models with Ray Serve
  • Build end-to-end machine learning applications with Ray AIR


Table of contents

  Foreword
  Preface
    1. Who Should Read This Book
    2. Goals of This Book
    3. Navigating This Book
    4. How to Use the Code Examples
    5. Conventions Used in This Book
    6. Using Code Examples
    7. O’Reilly Online Learning
    8. How to Contact Us
    9. Acknowledgments
  1. An Overview of Ray
    1. What Is Ray?
      1. What Led to Ray?
      2. Ray’s Design Principles
      3. Three Layers: Core, Libraries, and Ecosystem
    2. A Distributed Computing Framework
    3. A Suite of Data Science Libraries
      1. Ray AIR and the Data Science Workflow
      2. Data Processing with Ray Datasets
      3. Model Training
      4. Hyperparameter Tuning
      5. Model Serving
    4. A Growing Ecosystem
    5. Summary
  2. Getting Started with Ray Core
    1. An Introduction to Ray Core
      1. A First Example Using the Ray API
      2. An Overview of the Ray Core API
    2. Understanding Ray System Components
      1. Scheduling and Executing Work on a Node
      2. The Head Node
      3. Distributed Scheduling and Execution
    3. A Simple MapReduce Example with Ray
      1. Mapping and Shuffling Document Data
      2. Reducing Word Counts
    4. Summary
  3. Building Your First Distributed Application
    1. Introducing Reinforcement Learning
    2. Setting Up a Simple Maze Problem
    3. Building a Simulation
    4. Training a Reinforcement Learning Model
    5. Building a Distributed Ray App
    6. Recapping RL Terminology
    7. Summary
  4. Reinforcement Learning with Ray RLlib
    1. An Overview of RLlib
    2. Getting Started with RLlib
      1. Building a Gym Environment
      2. Running the RLlib CLI
      3. Using the RLlib Python API
    3. Configuring RLlib Experiments
      1. Resource Configuration
      2. Rollout Worker Configuration
      3. Environment Configuration
    4. Working with RLlib Environments
      1. An Overview of RLlib Environments
      2. Working with Multiple Agents
      3. Working with Policy Servers and Clients
    5. Advanced Concepts
      1. Building an Advanced Environment
      2. Applying Curriculum Learning
      3. Working with Offline Data
      4. Other Advanced Topics
    6. Summary
  5. Hyperparameter Optimization with Ray Tune
    1. Tuning Hyperparameters
      1. Building a Random Search Example with Ray
      2. Why Is HPO Hard?
    2. An Introduction to Tune
      1. How Does Tune Work?
      2. Configuring and Running Tune
    3. Machine Learning with Tune
      1. Using RLlib with Tune
      2. Tuning Keras Models
    4. Summary
  6. Data Processing with Ray
    1. Ray Datasets
      1. Ray Datasets Basics
      2. Computing Over Ray Datasets
      3. Dataset Pipelines
      4. Example: Training Copies of a Classifier in Parallel
    2. External Library Integrations
    3. Building an ML Pipeline
    4. Summary
  7. Distributed Training with Ray Train
    1. The Basics of Distributed Model Training
    2. Introduction to Ray Train by Example
      1. Predicting Big Tips in NYC Taxi Rides
      2. Loading, Preprocessing, and Featurization
      3. Defining a Deep Learning Model
      4. Distributed Training with Ray Train
      5. Distributed Batch Inference
    3. More on Trainers in Ray Train
      1. Migrating to Ray Train with Minimal Code Changes
      2. Scaling Out Trainers
      3. Preprocessing with Ray Train
      4. Integrating Trainers with Ray Tune
      5. Using Callbacks to Monitor Training
    4. Summary
  8. Online Inference with Ray Serve
    1. Key Characteristics of Online Inference
      1. ML Models Are Compute Intensive
      2. ML Models Aren’t Useful in Isolation
    2. An Introduction to Ray Serve
      1. Architectural Overview
      2. Defining a Basic HTTP Endpoint
      3. Scaling and Resource Allocation
      4. Request Batching
      5. Multimodel Inference Graphs
    3. End-to-End Example: Building an NLP-Powered API
      1. Fetching Content and Preprocessing
      2. NLP Models
      3. HTTP Handling and Driver Logic
      4. Putting It All Together
    4. Summary
  9. Ray Clusters
    1. Manually Creating a Ray Cluster
    2. Deployment on Kubernetes
      1. Setting Up Your First KubeRay Cluster
      2. Interacting with the KubeRay Cluster
      3. Exposing KubeRay
      4. Configuring KubeRay
      5. Configuring Logging for KubeRay
    3. Using the Ray Cluster Launcher
      1. Configuring Your Ray Cluster
      2. Using the Cluster Launcher CLI
      3. Interacting with a Ray Cluster
    4. Working with Cloud Clusters
      1. AWS
      2. Using Other Cloud Providers
    5. Autoscaling
    6. Summary
  10. Getting Started with the Ray AI Runtime
    1. Why Use AIR?
    2. Key AIR Concepts by Example
      1. Ray Datasets and Preprocessors
      2. Trainers
      3. Tuners and Checkpoints
      4. Batch Predictors
      5. Deployments
    3. Workloads That Are Suited for AIR
      1. AIR Workload Execution
      2. AIR Memory Management
      3. AIR Failure Model
      4. Autoscaling AIR Workloads
    4. Summary
  11. Ray’s Ecosystem and Beyond
    1. A Growing Ecosystem
      1. Data Loading and Processing
      2. Model Training
      3. Model Serving
      4. Building Custom Integrations
      5. An Overview of Ray’s Integrations
    2. Ray and Other Systems
      1. Distributed Python Frameworks
      2. Ray AIR and the Broader ML Ecosystem
      3. How to Integrate AIR into Your ML Platform
    3. Where to Go from Here?
    4. Summary
  Index
  About the Authors

Product information

  • Title: Learning Ray
  • Author(s): Max Pumperla, Edward Oakes, Richard Liaw
  • Release date: February 2023
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781098117221