Building Memory with Vector Databases and RAG Architecture

Published by O'Reilly Media, Inc.

Content level: Intermediate

From embeddings to production-grade hybrid search for AI

What you’ll learn and how you can apply it

  • Build and deploy a functional vector database for RAG applications
  • Compare Chroma, Pinecone, and Weaviate and select the right tool for the job
  • Understand the internal indexing algorithms (HNSW, IVF) that make vector search scalable
  • Implement hybrid search (keyword plus semantic) to reduce AI hallucinations and “keyword blindness”

Course description

Move beyond surface-level RAG demos to master the foundation of intelligent retrieval systems: vector databases and memory architecture. Join Sumit Shukla on a deep dive into how vector databases work, moving from embedding strategies to indexing internals through live comparisons of tools like Chroma, Pinecone, and Weaviate. You’ll learn what actually happens when you store and retrieve semantic memory—and how to tune your system for performance, accuracy, and cost.

You’ll explore real-world challenges like scaling to millions of memory records, optimizing hybrid search (combining dense vectors with filters, keywords, and metadata), and designing memory pipelines that persist, retrieve, and update user interactions across sessions. By the end of the course, you’ll understand vector-backed memory not just as a plug-in but as an architectural layer that makes advanced agents and context-rich systems possible.

This live event is for you because...

  • You’re a backend developer or data scientist transitioning into an AI engineering role.
  • You’re a software engineer, machine learning engineer, or technical lead.

Prerequisites

  • Access to a Google account (for Colab)
  • An OpenAI API key or a Hugging Face access token (free)
  • Proficiency in Python
  • Basic understanding of LLMs (prompts, context windows)
  • Knowledge of RAG basics
  • Experience with the OpenAI API
  • Understanding of AI frameworks
  • No prior experience with vector DBs required

Recommended preparation:

  • OpenAI API setup

Schedule

The time frames are only estimates and may vary according to how the class is progressing.

Understanding vector memory and RAG foundations (60 minutes)

  • Presentation: Why SQL fails at semantic search; visualizing embeddings and vector spaces
  • Group discussion: Where the vector DB fits in the LLM stack and why context windows aren't enough
  • Hands-on exercise: Create embeddings with Python and calculate cosine similarity manually to understand the math
  • Q&A
  • Break
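
The manual cosine similarity calculation named in the hands-on exercise above can be sketched in plain Python. This is an illustrative sketch, not the course's own notebook: the function name and the three toy vectors are assumptions, and real embeddings would come from a model and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional "embeddings" (made-up values, not model output).
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

print("cat vs kitten: ", round(cosine_similarity(cat, kitten), 3))
print("cat vs invoice:", round(cosine_similarity(cat, invoice), 3))
```

Vectors that point in nearly the same direction score close to 1, and unrelated ones score close to 0, which is the property a vector database relies on when it ranks stored records against a query.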

The vector DB showdown (60 minutes)

  • Presentation: Comparing Chroma (local), Pinecone (managed/serverless), and Weaviate (AI native)
  • Hands-on exercise: Implement the same RAG ingestion pipeline in Chroma and Faiss to see the syntax and architectural differences
  • Group discussion: When to use serverless or self-hosted; cost and privacy
  • Break

Indexing algorithms (60 minutes)

  • Presentation: How to search a million vectors in milliseconds; visualizing flat, inverted file (IVF), and hierarchical navigable small world (HNSW) indexing
  • Hands-on exercise: Run a script to insert 10K vectors and time searches against a brute-force (flat) index and an HNSW index
  • Q&A
  • Break

Advanced patterns—hybrid search (50 minutes)

  • Presentation: Why an exact-match query like “Error 503” fails in pure vector search; introduction to sparse vectors and BM25
  • Hands-on exercise: Use LangChain to combine a sparse retriever (keyword) and a dense retriever (semantic) with reciprocal rank fusion (RRF)
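
The fusion step in the exercise above, reciprocal rank fusion, can be sketched without any framework. This is a plain-Python illustration and not LangChain's API: the function name, the document IDs, and the two ranked lists are hypothetical, and the smoothing constant k=60 is a commonly used default rather than a value taken from the course. Each document scores the sum of 1 / (k + rank) over every ranked list it appears in.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    rankings: list of lists, each ordered best-first.
    k: smoothing constant (60 is a common choice, assumed here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for the query "Error 503".
keyword_hits = ["doc_503", "doc_timeout", "doc_retry"]    # sparse (e.g. BM25)
semantic_hits = ["doc_outage", "doc_503", "doc_timeout"]  # dense (embeddings)

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)
```

Because the fusion uses only rank positions, it does not need the keyword and vector scores to be on the same scale, which is one reason it is a popular way to combine the two retrievers.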

Wrap-up and Q&A (10 minutes)

Your Instructor

  • Sumit Shukla

    Sumit Shukla is an experienced AI practitioner and educator specializing in large language models and enterprise AI infrastructure. As director of AI at Scaletrix.ai, he leads teams building production-grade RAG systems for financial and healthcare clients. With a background in ed tech, Sumit excels at breaking down complex mathematical concepts—like vector spaces and indexing algorithms—into intuitive, hands-on engineering lessons.

Skill covered

Retrieval Augmented Generation (RAG)