Intro to Ray and Ray AIR Featuring Large Language Models
Published by O'Reilly Media, Inc.
Easily scale AI workloads with Ray AIR (AI Runtime)
- Learn what Ray is and how it can function as a backbone for distributed computing
- Explore how Ray’s composable component design simplifies scaling common tasks, even for complex language models and pipelines
- Understand where Ray can be extended with custom code or additional integrations
The incredible impact of large language models (LLMs) highlights both the success of data science at scale and the need for large-scale data ingestion, model training, tuning, and serving.
Join expert Adam Breindel to explore how Ray 2.x’s AI Runtime simplifies these challenges by providing components that allow you to scale the popular patterns you’re already using. You'll learn how to ingest, model, tune, predict, and serve with just a few lines of code using well-designed Ray AIR APIs.
What you’ll learn and how you can apply it
- Apply components like Ray Data, Ray Train, and Ray Tune to scale elements in your data science lifecycle
- Serve models using Ray Serve no matter where the models are trained
- Explore additional integrations such as Hugging Face and Gradio on Ray
This live event is for you because...
- You’re a data scientist, MLOps engineer, or data engineer.
- You work with data/ML tools like Python, PyTorch, pandas, TensorFlow, or XGBoost.
- You want to become more effective at scaling your own work or designing scalable systems for your organization.
Prerequisites
- Basic familiarity with Python data science tooling and ML concepts (data frames, deep learning, model tuning, etc.)
Recommended preparation:
- Read “NumPy Basics: Arrays and Vectorized Computation” and “Getting Started with pandas” (chapters 4 and 5 in Python for Data Analysis)
- Read “Machine Learning” (chapter 5 of Python Data Science Handbook)
- Read “Introduction to Artificial Neural Networks with Keras” and “Reinforcement Learning” (chapters 10 and 18 in Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, second edition)
- Read “Transformer Models” and “Using Transformers” (chapters 1 and 2 in the Hugging Face NLP Course)
Recommended follow-up:
- Read What Is Ray? (book)
- Read Natural Language Processing with Transformers, revised edition (book)
Schedule
The time frames are only estimates and may vary according to how the class is progressing.
Introduction (20 minutes)
- Presentation: What is Ray, where did it come from, and why was it created?
- Group discussion: Your use of clustering technologies
Ray and the data science workflow (45 minutes)
- Demonstration: End-to-end with Ray
- Presentation: Ray AIR data access and feature engineering, modeling, and optimization; deployment with Ray Serve
- Hands-on exercise: Publish a service with Ray Serve
- Group discussion: Your machine learning model types/use cases
- Q&A
- Break
Ray architecture and setup (20 minutes)
- Presentation: Head nodes versus worker nodes; object store; GCS; installing locally
- Demonstration: Dashboard
Data and distributed modeling with Ray Train (35 minutes)
- Presentation: Preprocessing language model training data with Ray Data; fine-tuning an LLM with PyTorch, Hugging Face, and Ray
- Hands-on exercise: Prediction with the trained LLM
- Q&A
- Break
Model optimization with Ray Tune (10 minutes)
- Presentation: Basics of integration; built-in algorithmic support; model/training integration example
- Demonstration: Output via TensorBoard
Reinforcement learning with RLlib (15 minutes)
- Presentation: Quick examples of built-in algorithms; integrating RLlib; basic single-agent environments; OpenAI Gym and the gym.Env API
- Hands-on exercise: Switch algorithms in an RLlib experiment (if time allows)
Ray Serve (25 minutes)
- Presentation: Serving architecture options and trade-offs; pipelines and deployment graphs
- Hands-on exercise: Build and serve an LLM chatbot
Wrap-up and Q&A (10 minutes)
Your Instructor
Adam Breindel
Adam Breindel consults and teaches courses on Apache Spark, data engineering, machine learning, AI, and deep learning. He supports instructional initiatives as a senior instructor at Databricks, has taught classes on Apache Spark and deep learning for O'Reilly, and runs a business helping large firms and startups implement data and ML architectures. Adam’s first full-time job in tech was neural net–based fraud detection, deployed at some of North America's largest banks; since then, he's worked with numerous startups, where he’s enjoyed building things like mobile check-in for two of America's five biggest airlines years before the iPhone came out. He’s also worked in entertainment, insurance, and retail banking; on web, embedded, and server apps; and on clustering architectures, APIs, and streaming analytics.