Chapter 3. Automating the LLM Life Cycle

Developing and deploying an LLM-based application is an iterative endeavor and a practice of constant experimentation, one that can accumulate significant technical debt. Across the model life cycle, LLM teams create many versions of their data, models, and pipeline stack, and not all of those changes get documented. And then there are the tools and dependencies!

Parts of the model pipeline resemble operationalizing deterministic ML models (aka MLOps), but there are several new challenges as well. Where LLMOps differs substantially from MLOps is that these models are generative in nature, which can make evaluating and debugging a model’s performance much harder, as the short sketch after this paragraph illustrates: there is rarely a single correct output to check against. Conventional feature engineering is also no longer relevant for large language models. And given the sheer size of these models, a big focus for LLM teams is performance optimization, which includes dealing with data and model parallelism, load imbalance, memory and resource management, and more.
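To see why generative output resists conventional testing, consider a toy check: a deterministic classifier can be scored with exact match, but two equally valid LLM answers rarely match token for token. The following is a minimal illustration using only the Python standard library; the reference string, candidate answers, and use of a string-similarity ratio are assumptions for demonstration, standing in for the embedding-similarity or LLM-as-judge scoring used in practice.

    from difflib import SequenceMatcher

    reference = "Paris is the capital of France."
    candidates = [
        "Paris is the capital of France.",   # verbatim match
        "The capital of France is Paris.",   # equally correct, different wording
    ]

    for answer in candidates:
        exact = answer == reference
        # Surface-level similarity only; real LLM evals typically rely on
        # embedding similarity or an LLM-as-judge rather than string ratios.
        fuzzy = SequenceMatcher(None, reference, answer).ratio()
        print(f"exact={exact!s:<5} fuzzy={fuzzy:.2f}  {answer}")

The first candidate passes exact match; the second fails it despite being just as correct, which is exactly the gap that makes evaluating generative models harder than evaluating deterministic ones.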

LLMOps helps automate and streamline these processes, resolving data communication and synchronization challenges across the entire LLM life cycle so that teams can take models from prototype to production quickly, consistently, and effectively.
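One concrete example of that streamlining is automatic run tracking, which attacks the undocumented-versions problem directly. Below is a minimal sketch using only the Python standard library; the dataset path, model name, and record fields are illustrative assumptions, not a prescribed schema.

    import hashlib
    import json
    from datetime import datetime, timezone

    def record_run(dataset_path: str, base_model: str, params: dict) -> dict:
        """Fingerprint an experiment so every version change is documented."""
        config = {"dataset": dataset_path, "base_model": base_model, "params": params}
        # A stable hash of the full configuration serves as the run's version ID:
        # identical configs always produce the same ID, so drift is detectable.
        run_id = hashlib.sha256(
            json.dumps(config, sort_keys=True).encode()
        ).hexdigest()[:12]
        return {
            "run_id": run_id,
            "recorded_at": datetime.now(timezone.utc).isoformat(),
            **config,
        }

    # Hypothetical values for illustration only.
    run = record_run("s3://corpus/v7", "example-base-model", {"lr": 2e-5, "epochs": 3})
    print(run["run_id"])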

This chapter goes through the eight steps in the LLM life cycle, pictured in Figure 3-1:

  1. Data engineering

  2. Pretraining

  3. Base model selection

  4. Domain adaptation: prompt engineering, RAG, and fine-tuning ...
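Before diving into the individual steps, here is a deliberately minimal orchestration sketch to make the flow concrete: each stage is a plain function chained into a pipeline. The stage bodies are placeholders whose names mirror a few of the steps above; a real deployment would use a workflow orchestrator with retries, logging, and artifact versioning rather than this toy loop.

    from typing import Any, Callable

    # Each stage takes the artifacts produced so far and returns them updated.
    Stage = Callable[[dict[str, Any]], dict[str, Any]]

    def data_engineering(ctx: dict) -> dict:
        ctx["dataset"] = "cleaned-corpus"        # placeholder artifact
        return ctx

    def base_model_selection(ctx: dict) -> dict:
        ctx["base_model"] = "chosen-base-model"  # placeholder artifact
        return ctx

    def domain_adaptation(ctx: dict) -> dict:
        ctx["adapted_model"] = f"{ctx['base_model']}+rag+fine-tune"
        return ctx

    PIPELINE: list[Stage] = [data_engineering, base_model_selection, domain_adaptation]

    def run_pipeline(stages: list[Stage]) -> dict:
        ctx: dict[str, Any] = {}
        for stage in stages:
            ctx = stage(ctx)  # in practice: retries, logging, versioning
            print(f"finished {stage.__name__}: {ctx}")
        return ctx

    run_pipeline(PIPELINE)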
