A Simple Guide to Retrieval Augmented Generation

Name: A Simple Guide to Retrieval Augmented Generation
Author: Abhinav Kimothi
ISBN: 9781633435858

by Abhinav Kimothi

June 2025

Beginner to intermediate

256 pages

7h 15m

English

Manning Publications

A Simple Guide to Retrieval Augmented Generation
copyright
contents
dedication
preface
acknowledgments
about this book
about the author
about the cover illustration
Part 1 Foundations

1 LLMs and the need for RAG
1.1 Curse of the LLMs and the idea of RAG
1.1.1 LLMs are not trained for facts
1.1.2 What is RAG?
1.2 The novelty of RAG
1.2.1 The RAG discovery
1.2.2 How does RAG help?
1.3 Popular RAG use cases
1.3.1 Search Engine Experience
1.3.2 Personalized marketing content generation
1.3.3 Real-time event commentary
1.3.4 Conversational agents
1.3.5 Document question answering systems
1.3.6 Virtual assistants
1.3.7 AI-powered research
1.3.8 Social media monitoring and sentiment analysis
1.3.9 News generation and content curation
2 RAG systems and their design
2.1 What does a RAG system look like?
2.2 Design of RAG systems
2.3 Indexing pipeline
2.4 Generation pipeline
2.5 Evaluation and monitoring
2.6 The RAGOps Stack
2.7 Caching, guardrails, security, and other layers
Part 2 Creating RAG systems
3 Indexing pipeline: Creating a knowledge base for RAG
3.1 Data loading
3.2 Data splitting (chunking)
3.2.1 Advantages of chunking
3.2.2 Chunking process
3.2.3 Chunking methods
3.2.4 Choosing a chunking strategy
3.3 Data conversion (embeddings)
3.3.1 What are embeddings?
3.3.2 Common pre-trained embeddings models
3.3.3 Embeddings use cases
3.3.4 How to choose embeddings?
3.4 Storage (vector databases)
3.4.1 What are vector databases?
3.4.2 Types of vector databases
3.4.3 Choosing a vector database
4 Generation pipeline: Generating contextual LLM responses
4.1 Generation pipeline overview
4.2 Retrieval
4.2.1 Progression of retrieval methods
4.2.2 Popular retrievers
4.2.3 A simple retriever implementation
4.3 Augmentation
4.3.1 RAG prompt engineering techniques
4.3.2 A simple augmentation prompt creation
4.4 Generation
4.4.1 Categorization of LLMs and suitability for RAG
4.4.2 Completing the RAG pipeline: Generation using LLMs
5 RAG evaluation: Accuracy, relevance, and faithfulness
5.1 Key aspects of RAG evaluation
5.1.1 Quality scores
5.1.2 Required abilities
5.2 Evaluation metrics
5.2.1 Retrieval metrics
5.2.2 RAG-specific metrics
5.3 Frameworks
5.3.1 RAGAs
5.3.2 Automated RAG evaluation system
5.4 Benchmarks
5.4.1 RGB
5.5 Limitations and best practices
Part 3 RAG in production
6 Progression of RAG systems: Naïve, advanced, and modular RAG
6.1 Limitations of naïve RAG
6.2 Advanced RAG techniques
6.3 Pre-retrieval techniques
6.3.1 Index optimization
6.3.2 Query optimization
6.4 Retrieval strategies
6.4.1 Hybrid retrieval
6.4.2 Iterative retrieval
6.4.3 Recursive retrieval
6.4.4 Adaptive retrieval
6.5 Post-retrieval techniques
6.5.1 Compression
6.6 Modular RAG
6.6.1 Core modules
6.6.2 New modules
7 Evolving RAGOps stack
7.1 The evolving RAGOps stack
7.1.1 Critical layers
7.1.2 Essential layers
7.1.3 Enhancement layers
7.2 Production best practices
Part 4 Additional considerations
8 Graph, multimodal, agentic, and other RAG variants
8.1 What are RAG variants, and why do we need them?
8.2 Multimodal RAG
8.2.1 Data modality
8.2.2 Multimodal RAG use cases
8.2.3 Multimodal RAG pipelines
8.2.4 Challenges and best practices
8.3 Knowledge graph RAG
8.3.1 Knowledge graphs
8.3.2 Knowledge graph RAG use cases
8.3.3 Graph RAG approaches
8.3.4 Graph RAG pipelines
8.3.5 Challenges and best practices
8.4 Agentic RAG
8.4.1 LLM agents
8.4.2 Agentic RAG capabilities
8.4.3 Agentic RAG pipelines
8.4.4 Challenges and best practices
8.5 Other RAG variants
8.5.1 Corrective RAG
8.5.2 Speculative RAG
8.5.3 Self-reflective (self RAG)
8.5.4 RAPTOR
9 RAG development framework and further exploration
9.1 RAG development framework
9.1.1 Initiation stage: Defining and scoping the RAG system
9.2 Design stage: Layering the RAGOps stack
9.2.1 Indexing pipeline design
9.2.2 Generation pipeline design
9.2.3 Other design considerations
9.2.4 Development stage: Building modular RAG pipelines
9.2.5 Evaluation stage: Validating and optimizing the RAG system
9.2.6 Deployment stage: Launching and scaling the RAG system
9.2.7 Maintenance stage: Ensuring reliability and adaptability
9.3 Ideas for further exploration
9.3.1 Fine-tuning within RAG
9.3.2 Long-context windows in LLMs
9.3.3 Managed solutions
9.3.4 Difficult queries

Content preview from A Simple Guide to Retrieval Augmented Generation

3 Indexing pipeline: Creating a knowledge base for RAG

This chapter covers

Data loading
Text splitting or chunking
Converting text to embeddings
Storing embeddings in vector databases
Examples in Python using LangChain

In chapter 2, we discussed the main components of retrieval-augmented generation (RAG) systems. You may recall that the indexing pipeline creates the knowledge base or the non-parametric memory of RAG applications. An indexing pipeline needs to be set up before the real-time user interaction with the large language model (LLM) can begin.

This chapter elaborates on the four components of the indexing pipeline. We begin by discussing data loading, which involves connecting to the source, extracting files, and parsing text. ...
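
To make the four indexing steps concrete, here is a minimal sketch in Python. It is not taken from the book, which uses LangChain for its examples; this version deliberately relies only on the standard library. Every name in it (`load_text`, `chunk_text`, `embed`, `build_index`, `IndexedChunk`) is hypothetical, the hashing-based `embed` function is a toy stand-in for a real embeddings model, and the in-memory list stands in for a vector database.

```python
import hashlib
import math
import re
from dataclasses import dataclass


@dataclass
class IndexedChunk:
    source: str          # where the chunk came from
    text: str            # the chunk itself
    vector: list[float]  # its (toy) embedding


def load_text(raw: str) -> str:
    """Stand-in for data loading: collapse whitespace in already-extracted text."""
    return re.sub(r"\s+", " ", raw).strip()


def chunk_text(text: str, chunk_size: int = 8, overlap: int = 2) -> list[str]:
    """Split text into word windows of `chunk_size` that share `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text: str, dim: int = 16) -> list[float]:
    """Toy embedding (assumption): hash words into `dim` buckets, then L2-normalize."""
    vec = [0.0] * dim
    for word in re.findall(r"\w+", text.lower()):
        bucket = int(hashlib.md5(word.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def build_index(documents: dict[str, str]) -> list[IndexedChunk]:
    """Run load -> chunk -> embed -> store for every document."""
    index = []
    for source, raw in documents.items():
        for chunk in chunk_text(load_text(raw)):
            index.append(IndexedChunk(source, chunk, embed(chunk)))
    return index


docs = {
    "note.txt": "The indexing pipeline   creates the knowledge base.\n"
                "It runs before any user query reaches the large language model."
}
index = build_index(docs)
for item in index:
    print(item.source, "|", item.text)
```

The overlap between neighbouring chunks is one common way to avoid cutting a thought in half at a chunk boundary. A real pipeline would swap the toy `embed` for a pre-trained embeddings model and the list for a vector database.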
