
RAG with Python Cookbook

Name: RAG with Python Cookbook
Author: Dominik Polzer
ISBN: 9798341600560

by Dominik Polzer

May 2026

Intermediate to advanced

378 pages

8h 17m

English

O'Reilly Media, Inc.


Preface
Who This Book Is For
What You’ll Learn and How the Book Is Organized
Conventions Used in This Book
Using Code Examples
O’Reilly Online Learning
How to Contact Us
Acknowledgments
1. Getting Started with RAG
1.1. Identifying High-Value RAG Use Cases for Your Organization
1.2. Choosing Your IDE and Coding Agent Setup
1.3. Getting Started with Jupyter Notebooks in VS Code
1.4. Storing Secrets and API Keys with .env Files
1.5. Building Your First RAG App
1.6. Choosing the Frameworks and Libraries for Your RAG Applications
1.7. Running the Code Examples in the Book Repository
2. Foundation Models
2.1. Defining a Suitable Prompt Template
2.2. Selecting the Right Language Model for Your Task
2.3. Generating Content with the OpenAI API
2.4. Generating Content with Google’s Gemini Models
2.5. Generating Content with the Anthropic API
2.6. Running Open Source Models Locally with Ollama
2.7. Creating Structured Outputs with the OpenAI SDK and Pydantic
3. Loading Data
3.1. Loading Word Files in Python
3.2. Loading PDF Files
3.3. Loading and Handling Tabular Data from Excel and CSV Files
3.4. Loading Structured Data from a PostgreSQL Database
3.5. Loading Audio Files via Speech-to-Text Models
3.6. Extracting Text from Images and PDFs via Tesseract OCR
3.7. Extracting Text from Images via Multimodal Models
3.8. Generating Text Description for Images via Multimodal Models
3.9. Generating Text Summaries for Embedded Tables via Multimodal Models
3.10. Parsing PDFs with Multimodal Content
3.11. Loading Videos via Speech-to-Text and Multimodal Models
4. Data Preparation
4.1. Adding Metadata to Enable Metadata Filtering
4.2. Enhancing Data Quality by Replacing Abbreviations and Technical Terms
4.3. Improving Search Accuracy by Creating Hypothetical Questions for Text Chunks
4.4. Splitting Documents via Character Splitting
4.5. Splitting Documents with Recursive Text Splitters
4.6. Chunking Documents with Document-Aware Splitting
4.7. Splitting Text with Semantic-Aware Chunkers
4.8. Splitting Text with Agentic Chunkers
5. Embeddings
5.1. Mapping the Linguistic Meaning of Text Chunks to a Numerical Representation
5.2. Visualizing Semantic Relationships Between Text Chunks via Dimensionality Reduction Techniques
5.3. Calculating the Distance Between Embeddings
5.4. Choosing the Right Embedding Model
5.5. Generating Embeddings for Images and Text with CLIP
5.6. Performing Text Classification with Embeddings
5.7. Improving Search Results with a Hybrid Search Approach
6. Vector Databases and Similarity Searches
6.1. Choosing the Right Vector Database
6.2. Storing and Searching Embeddings with FAISS
6.3. Storing and Working with Embeddings in a Chroma Vector Database
6.4. Storing Embeddings in PostgreSQL with the pgvector Extension
6.5. Performing Similarity Search in PostgreSQL
6.6. Accelerating Vector Searches in PostgreSQL with Indexing Techniques
6.7. Combining Keyword and Similarity Search to Improve Retrieval Accuracy with PostgreSQL
7. Retrieval
7.1. Optimizing Query Results via Metadata Filtering in PostgreSQL
7.2. Enhancing Retrieval Accuracy with HyDE
7.3. Improving Search Results with Multiquery Retrieval
7.4. Addressing Complex Requests by Designing a Query Routing System
7.5. Enhancing Retrieved Documents by Designing an Auto-Merging Retriever
7.6. Retrieving More Complete Text Chunks with a Sentence Window Retriever
7.7. Improving Retrieval Relevancy with Reranking Methods
7.8. Decomposing Complex Queries into Multiple Subqueries
8. Agentic RAG
8.1. Designing a Custom Tool in Python
8.2. Using Workflow Patterns in Multiagent Systems
8.3. Choosing an Agentic Framework
8.4. Building an Agentic System via Function Calling
8.5. Accelerating Agents with asyncio
8.6. Building a Sales Negotiation Agent with OpenAI’s Agents SDK and Chroma
8.7. Enriching Your Agent’s Capabilities with MCP Tools
8.8. Building an Agentic System with LangGraph
9. Graph RAG
9.1. Creating Your First Neo4j Knowledge Graph and Feeding It with Text from Documents
9.2. Extending the Knowledge Graph with Structured Data
9.3. Building Your First Cypher Query
9.4. Enabling Semantic Search on a Neo4j Knowledge Graph
9.5. Optimizing the Knowledge Graph for RAG Systems

10. Evaluating RAG Systems
10.1. Choosing the Right Evaluation Metrics for RAG Systems
10.2. Evaluating RAG Systems by Humans
10.3. Creating Synthetic Data for Automated Testing
10.4. Evaluating the Retriever Step by Calculating Context Precision@k
10.5. Evaluating Faithfulness During Generation with LLM-as-a-Judge
10.6. Evaluating the Response Relevancy of Your RAG System
11. RAG Web Apps
11.1. Building Your First Streamlit App
11.2. Building a Chatbot App with Streamlit
11.3. Adding PDF Analyzer Functionality to Your Chatbot
11.4. Connecting Your RAG App to a SQL Database
11.5. Deploying Your Streamlit App with Docker and AWS
Index
About the Author

Content preview from RAG with Python Cookbook

Chapter 3. Loading Data

About 80% of enterprise information is unstructured and distributed across presentations, documents, emails, and media files. Figure 3-1 shows some of the common data types.

Pie chart showing distribution of data types in companies, with approximately 80% unstructured data (documents, presentations, emails) and 20% structured data (databases, spreadsheets)

This chapter shows how to turn common data sources into text that you can embed and retrieve. Most RAG retrievers work with text embeddings, so a practical first step is to convert different formats into a consistent text representation.
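One way to picture a "consistent text representation" is a small loader registry that maps file extensions to functions returning plain text. The following is a minimal sketch using only the standard library, not the book's implementation; the `Document` class, the `LOADERS` table, and `load_document` are hypothetical names introduced for illustration, and the formats covered here (text, CSV, JSON) are simply the ones that need no third-party packages.

```python
import csv
import json
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Document:
    """Hypothetical container: extracted text plus metadata about its origin."""
    text: str
    metadata: dict = field(default_factory=dict)


def load_txt(path: Path) -> str:
    return path.read_text(encoding="utf-8")


def load_csv(path: Path) -> str:
    # Render each row as "column: value" pairs so the text stays readable.
    with path.open(newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    return "\n".join(
        ", ".join(f"{key}: {value}" for key, value in row.items()) for row in rows
    )


def load_json(path: Path) -> str:
    data = json.loads(path.read_text(encoding="utf-8"))
    return json.dumps(data, indent=2, ensure_ascii=False)


# Hypothetical registry: file extension -> function that returns plain text.
LOADERS = {".txt": load_txt, ".md": load_txt, ".csv": load_csv, ".json": load_json}


def load_document(path) -> Document:
    """Dispatch on the file extension and wrap the result in a Document."""
    path = Path(path)
    suffix = path.suffix.lower()
    loader = LOADERS.get(suffix)
    if loader is None:
        raise ValueError(f"No loader registered for {suffix!r}")
    return Document(text=loader(path), metadata={"source": path.name, "format": suffix})
```

Because every loader returns a string, later steps such as chunking and embedding do not need to know which format a document came from. Supporting a new format would mean registering one more function in the table.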

Figure 3-2 summarizes the main components of a RAG system, showing both the indexing pipeline (loading and processing documents) and the runtime retrieval flow (retrieving relevant chunks and generating answers).

Diagram illustrating the workflow of a RAG system, including the processes of loading and preprocessing documents, generating embeddings, and retrieving and generating answers using large language models.
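The two flows in the figure can be sketched end to end in a few lines. This is a toy illustration under loud assumptions: `embed` is a bag-of-words counter standing in for a real embedding model, the index is a plain Python list rather than a vector database, and the generation step is reduced to building a prompt string. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def chunk(text: str, size: int = 8) -> list[str]:
    """Split text into chunks of at most `size` words (naive fixed-size chunking)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents: list[str]) -> list[tuple[str, Counter]]:
    """Indexing pipeline: loaded text -> chunks -> embeddings -> stored pairs."""
    return [(c, embed(c)) for doc in documents for c in chunk(doc)]


def retrieve(index: list[tuple[str, Counter]], query: str, k: int = 1) -> list[str]:
    """Runtime flow: embed the query and return the k most similar chunks."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(query: str, context: list[str]) -> str:
    """The retrieved chunks become the context passed to the language model."""
    return "Context:\n" + "\n".join(context) + f"\n\nQuestion: {query}"


index = build_index([
    "Invoices are stored as PDF files.",
    "Meeting recordings are stored as audio files.",
])
question = "Where are the invoices?"
print(build_prompt(question, retrieve(index, question)))
```

In a real system each stage would be replaced by a stronger component (a document loader, a proper chunker, an embedding model, a vector store, and a language model call), but the order of the steps stays the same.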

This chapter also explains the loading process for various document types.

Warning

This book builds core components from scratch to clarify the underlying concepts. In production, orchestration frameworks such as LangChain or LlamaIndex can accelerate development, but they also introduce moving parts like frequent breaking changes, fast-evolving APIs, and additional abstractions.

If you use these frameworks, pin dependency versions, follow their upgrade guides, and isolate framework-specific code behind small adapters. For the most stable foundation, ...
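The advice about isolating framework-specific code behind small adapters could look roughly like the sketch below. It assumes a text-splitting step as the example; `TextSplitter`, `ParagraphSplitter`, `FrameworkSplitterAdapter`, and `prepare_chunks` are hypothetical names, and no real framework API is called. The adapter takes the framework object and the name of its method as arguments, so the exact call depends on whichever version you have pinned.

```python
from typing import Protocol


class TextSplitter(Protocol):
    """The only interface the rest of the application depends on."""

    def split(self, text: str) -> list[str]: ...


class ParagraphSplitter:
    """Framework-free default: split on blank lines."""

    def split(self, text: str) -> list[str]:
        return [p.strip() for p in text.split("\n\n") if p.strip()]


class FrameworkSplitterAdapter:
    """Wraps a framework object behind the TextSplitter interface.

    `framework_splitter` and `method_name` are placeholders: pass whatever
    object and method your pinned framework version actually provides.
    """

    def __init__(self, framework_splitter, method_name: str):
        self._call = getattr(framework_splitter, method_name)

    def split(self, text: str) -> list[str]:
        # Normalize the framework's return value to a plain list of strings.
        return [str(chunk) for chunk in self._call(text)]


def prepare_chunks(text: str, splitter: TextSplitter) -> list[str]:
    # Application code sees only TextSplitter, never the framework itself.
    return splitter.split(text)
```

If a framework upgrade renames a method or changes a return type, only the adapter needs to change; callers of `prepare_chunks` stay untouched.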


Publisher Resources

ISBN: 9798341600553



Figure 3-1. A common distribution of data in companies

Figure 3-2. The components of a RAG system
