Chapter 5. Hyperparameter Optimization with Ray Tune
In Chapter 4 you learned how to build and run various reinforcement learning experiments. Running such experiments can be expensive, in terms of both compute resources and the time it takes to run them. This expense only gets amplified as you move on to more challenging tasks, since it is unlikely that you can just pick an algorithm out of the box and run it to get a good result. In other words, at some point you’ll need to tune the hyperparameters of your algorithms to get the best results. As we’ll see in this chapter, tuning machine learning models is hard, but Ray Tune is an excellent choice to help you tackle this task.
Ray Tune is a powerful tool for hyperparameter optimization (HPO). Not only does it work in a distributed manner by default (and with any other Ray library discussed in this book), but it's also one of the most feature-rich HPO libraries available. To top this off, Tune integrates with some of the most prominent HPO libraries out there, such as Hyperopt, Optuna, and many more. This makes Tune an ideal candidate for distributed HPO experiments, whether you're coming from other libraries or starting from scratch.
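To give you a first taste of what a Tune experiment looks like before we dive in, here is a minimal sketch. It assumes a recent Ray 2.x installation with the Tuner API; the toy objective function and its single x parameter are placeholders for illustration only, not part of the RLlib experiments from the previous chapter:

```python
from ray import tune


def objective(config):
    # Toy "training" step: in practice you would train a model here
    # and return (or report) its validation metric.
    return {"score": (config["x"] - 2) ** 2}


# Define a search space over x and let Tune sample it ten times,
# minimizing the reported score.
tuner = tune.Tuner(
    objective,
    param_space={"x": tune.uniform(-5, 5)},
    tune_config=tune.TuneConfig(metric="score", mode="min", num_samples=10),
)
results = tuner.fit()

# Inspect the best configuration Tune found.
print(results.get_best_result().config)
```

Because Tune runs each trial as a Ray task, the same sketch scales from a laptop to a cluster without code changes, which is exactly the property we'll rely on throughout this chapter.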
In this chapter we’ll first revisit in a bit more depth why HPO is hard to do and how you could naively implement it yourself with Ray. We then teach you the core concepts of Ray Tune and how you can use it to tune the RLlib models built in the previous chapter. To wrap things up, we’ll also have ...