Chapter 2. Introduction to LLMOps
The size and architectural complexity of LLMs can make productionizing these models incredibly hard. Productionizing means not just deploying a model but also monitoring it, evaluating it, and optimizing its performance.
There are constantly new challenges. Depending on your application, these may include how to process data, how to store and dynamically adapt prompts, how to monitor user interaction, and—most pressing—how to prevent the model from spreading misinformation or memorizing training data (which can lead it to release personal information). That’s why operationalizing LLMs, which means managing them day-to-day in production, requires a new framework.
LLMOps, as it’s called, is an operational framework for putting LLM applications in production. Although its name and principles are inspired by its older siblings, MLOps and DevOps, LLMOps is significantly more nuanced. The LLMOps framework can help companies reduce technical debt, maintain compliance, deal with LLMs’ dynamic and experimental nature, and minimize operational and reputational risk by avoiding common pitfalls.
This chapter starts by discussing what LLMOps is and how and where it departs from MLOps. We’ll then introduce you to the LLMOps engineer role and where it fits into existing ML teams. From there, we’ll look at how to measure LLMOps readiness within teams, assess your organization’s LLMOps maturity, and identify crucial KPIs for measuring success. Toward the end of ...