Building Memory with Vector Databases and RAG Architecture

Intermediate

From embeddings to production-grade hybrid search for AI

What you’ll learn and how you can apply it

Build and deploy a functional vector database for RAG applications
Compare Chroma, Pinecone, and Weaviate and select the right tool for the job
Understand the internal indexing algorithms (HNSW, IVF) that make vector search scalable
Implement hybrid search (keyword plus semantic) to reduce AI hallucinations and “keyword blindness”

Course description

Move beyond surface-level RAG demos to master the foundation of intelligent retrieval systems: vector databases and memory architecture. Join Sumit Shukla on a deep dive into how vector databases work, moving from embedding strategies to indexing internals through live comparisons of tools like Chroma, Pinecone, and Weaviate. You’ll learn what actually happens when you store and retrieve semantic memory—and how to tune your system for performance, accuracy, and cost.

You’ll explore real-world challenges like scaling to millions of memory records, optimizing hybrid search (combining dense vectors with filters, keywords, and metadata), and designing memory pipelines that persist, retrieve, and update user interactions across sessions. By the end of the course, you’ll understand vector-backed memory not just as a plug-in but as an architectural layer that makes advanced agents and context-rich systems possible.

This live event is for you because...

You’re a backend developer or data scientist transitioning into an AI engineering role.
You’re a software engineer, machine learning engineer, or technical lead.

Prerequisites

Access to a Google account (for Colab)
An OpenAI API key or a Hugging Face access token (free)
Proficiency in Python
Basic understanding of LLMs (prompts, context windows)
Knowledge of RAG basics
Experience with the OpenAI API
Understanding of AI frameworks
No prior experience with vector DBs required

Recommended preparation:

OpenAI API setup

Recommended follow-up:

Take Building Reliable RAG Applications: From PoC to Production (live online course with Sarang Sanjay Kulkarni)
Take AI Memory Management in Agentic Systems (live online course with Richmond Alake)

Schedule

The time frames are only estimates and may vary according to how the class is progressing.

Understanding vector memory and RAG foundations (60 minutes)

Presentation: Why SQL fails at semantic search; visualizing embeddings and vector spaces
Group discussion: Where the vector DB fits in the LLM stack and why context windows aren't enough
Hands-on exercise: Create embeddings with Python and calculate cosine similarity manually to understand the math
Q&A
Break
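
As a rough preview of the manual cosine similarity calculation in this segment, here is a minimal sketch in plain Python. The three-dimensional vectors and their names are hypothetical stand-ins chosen for illustration; real embedding models return vectors with hundreds or thousands of dimensions, and the course may use a different library or approach.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings" (assumption: invented values, not model output).
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

print("cat vs kitten: ", cosine_similarity(cat, kitten))
print("cat vs invoice:", cosine_similarity(cat, invoice))
```

Vectors that point in similar directions score close to 1, while unrelated ones score close to 0, which is the property a vector database relies on when ranking results.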

The vector DB showdown (60 minutes)

Presentation: Comparing Chroma (local), Pinecone (managed/serverless), and Weaviate (AI native)
Hands-on exercise: Implement the same RAG ingestion pipeline in Chroma and Faiss to see the syntax and architectural differences
Group discussion: When to use serverless or self-hosted; cost and privacy
Break

Indexing algorithms (60 minutes)

Presentation: How to search a million vectors in milliseconds; visualizing flat, inverted file (IVF), and hierarchical navigable small world (HNSW) indexing
Hands-on exercise: Run a script to insert 10K vectors and compare the search times of a brute-force (flat) index and an HNSW index
Q&A
Break
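
The trade-off this segment measures can be sketched without any external dependencies. HNSW itself requires a library, so the sketch below swaps in a toy inverted-file (IVF) style index instead: vectors are grouped under coarse centroids, and a query scans only the closest few groups. It is an illustration under stated assumptions, not the course script. The centroids are sampled at random rather than trained with k-means, and names such as `N_LISTS`, `nprobe`, and `ivf_search` are hypothetical.

```python
import random
import time

random.seed(0)
DIM, N, N_LISTS = 8, 10_000, 50  # assumed toy sizes


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


vectors = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N)]


def brute_force(query, k=5):
    # Flat index: compare the query against every stored vector.
    return sorted(range(N), key=lambda i: sq_dist(query, vectors[i]))[:k]


# Toy IVF build: pick coarse centroids (random sample, not k-means) and
# assign each vector to the inverted list of its nearest centroid.
centroids = random.sample(vectors, N_LISTS)
lists = [[] for _ in range(N_LISTS)]
for i, v in enumerate(vectors):
    nearest = min(range(N_LISTS), key=lambda c: sq_dist(v, centroids[c]))
    lists[nearest].append(i)


def ivf_search(query, k=5, nprobe=5):
    # Scan only the nprobe inverted lists whose centroids are closest.
    probe = sorted(range(N_LISTS), key=lambda c: sq_dist(query, centroids[c]))[:nprobe]
    candidates = [i for c in probe for i in lists[c]]
    return sorted(candidates, key=lambda i: sq_dist(query, vectors[i]))[:k]


query = [random.gauss(0, 1) for _ in range(DIM)]

start = time.perf_counter()
exact = brute_force(query)
flat_seconds = time.perf_counter() - start

start = time.perf_counter()
approx = ivf_search(query)
ivf_seconds = time.perf_counter() - start

print(f"flat: {flat_seconds:.4f}s  ivf: {ivf_seconds:.4f}s")
print("overlap with exact top 5:", len(set(exact) & set(approx)))
```

Raising `nprobe` scans more lists, which improves recall at the cost of speed. Probing every list reproduces the brute-force result exactly.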

Advanced patterns—hybrid search (50 minutes)

Presentation: Why exact-match queries like “Error 503” fail in pure vector search; introduction to sparse vectors and BM25
Hands-on exercise: Use LangChain to combine a sparse retriever (keyword) and a dense retriever (semantic) with reciprocal rank fusion (RRF)
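
The fusion step in this exercise can be sketched in plain Python, independent of LangChain. Reciprocal rank fusion scores each document by summing 1 / (k + rank) over every ranked list in which it appears. The document IDs below are hypothetical, and k = 60 is a commonly used constant assumed here, not a value confirmed by the course.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document earns 1 / (k + rank) from every list that contains it,
    where rank starts at 1 for the top result.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical results for the query "Error 503" (assumption: invented IDs).
keyword_hits = ["doc_503", "doc_timeout", "doc_retry"]    # sparse retriever
semantic_hits = ["doc_outage", "doc_503", "doc_timeout"]  # dense retriever

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)
```

Because RRF uses only rank positions, it needs no score normalization between the keyword and semantic retrievers, and documents that both retrievers surface rise to the top.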

Wrap-up and Q&A (10 minutes)

Your Instructor

Sumit Shukla
Sumit Shukla is an experienced AI practitioner and educator specializing in large language models and enterprise AI infrastructure. As director of AI at Scaletrix.ai, he leads teams building production-grade RAG systems for financial and healthcare clients. With a background in ed tech, Sumit excels at breaking down complex mathematical concepts—like vector spaces and indexing algorithms—into intuitive, hands-on engineering lessons.

Skill covered

Retrieval Augmented Generation (RAG)
