RAG with Python Cookbook

Name: RAG with Python Cookbook
Author: Dominik Polzer
ISBN: 9798341600560

May 2026

Intermediate to advanced

378 pages

8h 17m

English

O'Reilly Media, Inc.

Preface
Who This Book Is For
What You’ll Learn and How the Book Is Organized
Conventions Used in This Book
Using Code Examples
O’Reilly Online Learning
How to Contact Us
Acknowledgments
1. Getting Started with RAG
1.1. Identifying High-Value RAG Use Cases for Your Organization
1.2. Choosing Your IDE and Coding Agent Setup
1.3. Getting Started with Jupyter Notebooks in VS Code
1.4. Storing Secrets and API Keys with .env Files
1.5. Building Your First RAG App
1.6. Choosing the Frameworks and Libraries for Your RAG Applications
1.7. Running the Code Examples in the Book Repository
2. Foundation Models
2.1. Defining a Suitable Prompt Template
2.2. Selecting the Right Language Model for Your Task
2.3. Generating Content with the OpenAI API
2.4. Generating Content with Google’s Gemini Models
2.5. Generating Content with the Anthropic API
2.6. Running Open Source Models Locally with Ollama
2.7. Creating Structured Outputs with the OpenAI SDK and Pydantic
3. Loading Data
3.1. Loading Word Files in Python
3.2. Loading PDF Files
3.3. Loading and Handling Tabular Data from Excel and CSV Files
3.4. Loading Structured Data from a PostgreSQL Database
3.5. Loading Audio Files via Speech-to-Text Models
3.6. Extracting Text from Images and PDFs via Tesseract OCR
3.7. Extracting Text from Images via Multimodal Models
3.8. Generating Text Description for Images via Multimodal Models
3.9. Generating Text Summaries for Embedded Tables via Multimodal Models
3.10. Parsing PDFs with Multimodal Content
3.11. Loading Videos via Speech-to-Text and Multimodal Models
4. Data Preparation
4.1. Adding Metadata to Enable Metadata Filtering
4.2. Enhancing Data Quality by Replacing Abbreviations and Technical Terms
4.3. Improving Search Accuracy by Creating Hypothetical Questions for Text Chunks
4.4. Splitting Documents via Character Splitting
4.5. Splitting Documents with Recursive Text Splitters
4.6. Chunking Documents with Document-Aware Splitting
4.7. Splitting Text with Semantic-Aware Chunkers
4.8. Splitting Text with Agentic Chunkers
5. Embeddings
5.1. Mapping the Linguistic Meaning of Text Chunks to a Numerical Representation
5.2. Visualizing Semantic Relationships Between Text Chunks via Dimensionality Reduction Techniques
5.3. Calculating the Distance Between Embeddings
5.4. Choosing the Right Embedding Model
5.5. Generating Embeddings for Images and Text with CLIP
5.6. Performing Text Classification with Embeddings
5.7. Improving Search Results with a Hybrid Search Approach
6. Vector Databases and Similarity Searches
6.1. Choosing the Right Vector Database
6.2. Storing and Searching Embeddings with FAISS
6.3. Storing and Working with Embeddings in a Chroma Vector Database
6.4. Storing Embeddings in PostgreSQL with the pgvector Extension
6.5. Performing Similarity Search in PostgreSQL
6.6. Accelerating Vector Searches in PostgreSQL with Indexing Techniques
6.7. Combining Keyword and Similarity Search to Improve Retrieval Accuracy with PostgreSQL
7. Retrieval
7.1. Optimizing Query Results via Metadata Filtering in PostgreSQL
7.2. Enhancing Retrieval Accuracy with HyDE
7.3. Improving Search Results with Multiquery Retrieval
7.4. Addressing Complex Requests by Designing a Query Routing System
7.5. Enhancing Retrieved Documents by Designing an Auto-Merging Retriever
7.6. Retrieving More Complete Text Chunks with a Sentence Window Retriever
7.7. Improving Retrieval Relevancy with Reranking Methods
7.8. Decomposing Complex Queries into Multiple Subqueries
8. Agentic RAG
8.1. Designing a Custom Tool in Python
8.2. Using Workflow Patterns in Multiagent Systems
8.3. Choosing an Agentic Framework
8.4. Building an Agentic System via Function Calling
8.5. Accelerating Agents with asyncio
8.6. Building a Sales Negotiation Agent with OpenAI’s Agents SDK and Chroma
8.7. Enriching Your Agent’s Capabilities with MCP Tools
8.8. Building an Agentic System with LangGraph
9. Graph RAG
9.1. Creating Your First Neo4j Knowledge Graph and Feeding It with Text from Documents
9.2. Extending the Knowledge Graph with Structured Data
9.3. Building Your First Cypher Query
9.4. Enabling Semantic Search on a Neo4j Knowledge Graph
9.5. Optimizing the Knowledge Graph for RAG Systems

10. Evaluating RAG Systems
10.1. Choosing the Right Evaluation Metrics for RAG Systems
10.2. Evaluating RAG Systems by Humans
10.3. Creating Synthetic Data for Automated Testing
10.4. Evaluating the Retriever Step by Calculating Context Precision@k
10.5. Evaluating Faithfulness During Generation with LLM-as-a-Judge
10.6. Evaluating the Response Relevancy of Your RAG System
11. RAG Web Apps
11.1. Building Your First Streamlit App
11.2. Building a Chatbot App with Streamlit
11.3. Adding PDF Analyzer Functionality to Your Chatbot
11.4. Connecting Your RAG App to a SQL Database
11.5. Deploying Your Streamlit App with Docker and AWS
Index
About the Author

Content preview from RAG with Python Cookbook

Chapter 9. Graph RAG

Graph RAG extends basic retrieval with graph traversal, which lets you move through networks of entities and relationships instead of relying only on the semantic similarity of isolated text embeddings.

Basic RAG systems split documents into chunks, embed them, and rely on vector search to surface relevant content. Vector search treats each chunk as an isolated unit, with no awareness of how pieces connect across a broader narrative. When the information needed for an answer is spread across multiple parts of a long document or across multiple sources, this approach misses relevant context. Dependencies, references, and relationships across sections get lost.
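
The chunk, embed, and search loop described above can be sketched in a few lines. This is a toy illustration, not the book's code: `embed` is a hypothetical bag-of-words stand-in for a real embedding model, and the sample chunks are invented. It shows that every chunk is scored on its own, so a chunk that only makes sense together with a neighboring one is easy to miss.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(query: str, chunks: list[str], k: int = 1) -> list[tuple[float, str]]:
    """Score every chunk independently against the query and return the top k."""
    query_vector = embed(query)
    scored = [(cosine(query_vector, embed(chunk)), chunk) for chunk in chunks]
    return sorted(scored, reverse=True)[:k]


# Invented sample chunks: the second one depends on the first for its meaning.
chunks = [
    "The provider guarantees 99.9 percent uptime for the hosting service.",
    "If this threshold is missed, the customer receives a credit.",
    "Invoices are due within 30 days.",
]

top = search("What uptime does the provider guarantee?", chunks, k=1)
print(top[0][1])
```

With `k=1`, only the uptime chunk is returned. The follow-up chunk about the credit shares almost no vocabulary with the query, so similarity alone does not connect it to the guarantee it refers to.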

Graph RAG closes this gap. Instead of storing text only as embeddings, it extracts entities, forms explicit relationships between them, and combines this graph structure with the vector index. This produces context that is richer and more precise, and preserves explicit connections between entities.
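
One way to picture the combination of graph and vector index: vector search yields seed chunks, and the graph is then traversed from those seeds to pull in connected entities and chunks. The following is a minimal sketch under that assumption, using plain dictionaries rather than a graph database. The triples, node names, and the `expand` helper are hypothetical and only illustrate the idea.

```python
from collections import defaultdict, deque

# Hypothetical (subject, relation, object) triples extracted from documents.
triples = [
    ("chunk_1", "MENTIONS", "Acme Corp"),
    ("chunk_2", "MENTIONS", "Acme Corp"),
    ("chunk_2", "MENTIONS", "Uptime SLA"),
    ("chunk_3", "MENTIONS", "Payment Terms"),
]

# Build an undirected adjacency list so traversal works in both directions.
graph: dict[str, set[str]] = defaultdict(set)
for subject, _relation, obj in triples:
    graph[subject].add(obj)
    graph[obj].add(subject)


def expand(seeds: list[str], max_hops: int = 2) -> set[str]:
    """Breadth-first traversal: collect all nodes within max_hops of the seeds."""
    seen = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # Do not traverse beyond the hop limit.
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return seen


# Assume vector search returned chunk_1 as the only hit.
context_nodes = expand(["chunk_1"], max_hops=2)
print(sorted(context_nodes))
```

Starting from `chunk_1`, two hops reach `chunk_2` through the shared entity, while the unrelated `chunk_3` stays out of the context.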

Figure 9-1 uses contract document data to illustrate the difference between basic RAG and graph RAG. In a basic RAG setup, the system maintains a pool of isolated embedding vectors. In graph RAG, every text snippet is anchored to its surrounding structure. Each clause is linked to its clause type, its company, its address, and the service-level agreement (SLA) it originates from. This structural context helps the model not only find relevant text but also see where that text belongs.
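
To make the contract example more concrete, the sketch below models one clause together with the structure it is anchored to, then renders both as context for a prompt. It is only an illustration: the `Clause` dataclass, its field names, and all sample values are hypothetical and are not taken from the figure.

```python
from dataclasses import dataclass


@dataclass
class Clause:
    """A text snippet plus the structure it is anchored to (hypothetical schema)."""

    text: str
    clause_type: str
    company: str
    address: str
    sla: str


def to_context(clause: Clause) -> str:
    """Render the clause together with its structural links for a prompt."""
    return (
        f"Clause ({clause.clause_type}) from {clause.sla}\n"
        f"Company: {clause.company}, {clause.address}\n"
        f"Text: {clause.text}"
    )


# Invented sample data for illustration only.
clause = Clause(
    text="The provider guarantees 99.9 percent uptime.",
    clause_type="Availability",
    company="Acme Corp",
    address="1 Example Street, Springfield",
    sla="SLA-2024-001",
)

print(to_context(clause))
```

Passing the rendered string instead of the bare clause text gives the language model the clause type, company, address, and SLA alongside the snippet itself.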

Figure 9-1. Graph RAG versus classic ...

Publisher Resources

ISBN: 9798341600553
Errata Page
