LLM Engineer's Handbook

by Paul Iusztin, Maxime Labonne
October 2024
Intermediate to advanced
522 pages
12h 55m
English
Packt Publishing
Content preview from LLM Engineer's Handbook

Chapter 9: RAG Inference Pipeline

Back in Chapter 4, we implemented the retrieval-augmented generation (RAG) feature pipeline to populate the vector database (DB). Within the feature pipeline, we gathered data from the data warehouse; cleaned, chunked, and embedded the documents; and ultimately loaded them into the vector DB. Thus, at this point, the vector DB is filled with documents and ready to be used for RAG.
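To make the clean-chunk-embed-load sequence concrete, here is a minimal, self-contained sketch of such a feature pipeline. All names (`clean`, `chunk`, `embed`, `load`, `vector_db`) are hypothetical, the embedding is a toy hash-based stand-in, and an in-memory list stands in for a real vector DB; a production pipeline would use a real embedding model and vector store instead.

```python
import hashlib
import re

def clean(text: str) -> str:
    # Normalize whitespace; real pipelines also strip markup, boilerplate, etc.
    return re.sub(r"\s+", " ", text).strip()

def chunk(text: str, chunk_size: int = 50) -> list[str]:
    # Naive fixed-size character chunks; real pipelines chunk on
    # sentence/token boundaries with overlap.
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

def embed(chunk_text: str, dim: int = 8) -> list[float]:
    # Toy deterministic "embedding": hash bytes scaled to [0, 1).
    # Stands in for a sentence-embedding model.
    digest = hashlib.sha256(chunk_text.encode()).digest()
    return [b / 255 for b in digest[:dim]]

vector_db: list[dict] = []  # in-memory stand-in for a vector DB collection

def load(document: str) -> None:
    # Clean, chunk, embed, then load each chunk with its vector.
    for c in chunk(clean(document)):
        vector_db.append({"text": c, "vector": embed(c)})

load("Retrieval-augmented generation grounds an LLM's answers "
     "in documents retrieved from a vector database.")
```

After `load` runs, every chunk sits in `vector_db` alongside its embedding, which is exactly the state the retrieval module below assumes.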

Based on the RAG methodology, you can split your software architecture into three modules: one for retrieval, one for augmenting the prompt, and one for generating the answer. We will follow a similar pattern by implementing a retrieval module to query the vector DB. Within this module, we will implement advanced RAG techniques to optimize the search. ...
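The three-module split described above can be sketched as follows. This is an illustrative toy, not the book's implementation: the corpus, its 3-dimensional vectors, and the function names are all invented for the example, and `generate` is a placeholder where a real LLM call would go.

```python
import math

# Tiny in-memory corpus with hand-picked toy embeddings (hypothetical values).
corpus = [
    {"text": "RAG retrieves documents to ground answers.", "vector": [1.0, 0.0, 0.0]},
    {"text": "Vector DBs store embeddings for similarity search.", "vector": [0.0, 1.0, 0.0]},
]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vector: list[float], top_k: int = 1) -> list[dict]:
    # Module 1: rank corpus entries by cosine similarity to the query vector.
    ranked = sorted(corpus, key=lambda d: cosine(d["vector"], query_vector), reverse=True)
    return ranked[:top_k]

def augment(question: str, context_docs: list[dict]) -> str:
    # Module 2: inject the retrieved context into the prompt.
    context = "\n".join(d["text"] for d in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

def generate(prompt: str) -> str:
    # Module 3: placeholder for an actual LLM call (e.g., an API client).
    return f"[LLM answer for a prompt of {len(prompt)} characters]"

docs = retrieve([0.9, 0.1, 0.0])           # query vector close to the first entry
answer = generate(augment("What does RAG do?", docs))
```

Keeping the modules separate means the retrieval step can be swapped or extended (reranking, query expansion, filtering) without touching prompt construction or generation, which is the property the advanced RAG techniques in this chapter rely on.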



Publisher Resources

ISBN: 9781836200079