Chapter 2. Introducing LLMOps

In June 2023, Nvidia CEO Jensen Huang told the world to “get ready for Software 3.0”, in which humans and machines work together to create smart systems that can switch effortlessly between natural language and code. The way we write AI applications has changed. And LLMs aren’t just doing the usual behind-the-scenes algorithmic functions like their older ML-model siblings—they’re changing how we see and interact with software at scale.

It’s a big shift from the Software 2.0 era, in which data scientists and ML engineers collected large volumes of data and engineered features extensively to build in-house models that generated predictions and classifications. Now LLMs are the frontend stars, acting as both connectors and orchestrators for the massive information sources around us: static sources (e.g., documents) and dynamic sources (e.g., websites and APIs).

Operationalizing LLMs

But any tech is only as good as the way you handle it. Without proper implementation and management, even the most advanced technology can falter, and LLM-based applications are no exception. If you are doing LLM engineering, you may already be implementing some LLMOps principles, albeit ineffectively.

First, do you have a well-defined strategy encompassing new tools, design patterns, and operational practices to help ensure that everything runs smoothly and reliably and can scale up without having to reengineer the whole infrastructure stack? This approach is called operationalizing ...
