Chapter 8. Retrieval-Augmented Generation (RAG)

In Part II of this book, we introduced several approaches that enable LLMs to interact with data stores and software tools. In this chapter, we take a deep dive into Retrieval-Augmented Generation (RAG), the dominant paradigm for integrating LLMs with external data sources. We will walk through each stage of the RAG pipeline in detail and explore the decisions involved in operationalizing RAG, including what kind of data to retrieve, how to retrieve it, and when to retrieve it. We will highlight how retrieval can help not only during inference but also during fine-tuning and pre-training. Finally, we will discuss situations where RAG may not be the best option and showcase alternatives.

The need for RAG

As introduced earlier in this book, RAG is an umbrella term used to describe a variety of techniques for using external data sources to augment the capabilities of ...
