Chapter 7. LLMOps
The number of Generative AI (GenAI) use cases has grown rapidly over the past few years, moving well beyond experimentation into applications that deliver tangible business value.
For example, Airbnb completed a large-scale LLM-driven code migration, updating nearly 3,500 React component test files. What was initially estimated to require 1.5 years of manual engineering effort was completed in just six weeks by combining frontier models with robust automation.
Similarly, DoorDash built AutoEval, an LLM-powered, human-in-the-loop system for evaluating the quality of search result pages. Instead of relying on slow and inconsistent human labeling, the system increased evaluation speed by 98% and expanded capacity by a factor of nine. Beyond scalability, it improved alignment with expert judgment and enabled continuous quality monitoring, allowing human experts ...