Building Reliable RAG Applications: From PoC to Production

Intermediate

Build, optimize, and deploy RAG applications for production

Course outcomes

Understand the evolution of RAG applications and apply best practices to enhance performance
Gain practical guidance on transitioning a RAG application from the proof-of-concept stage to a fully operational production environment
Learn how to operationalize RAG applications, focusing on security, observability, and best practices for maintenance

Join expert Sarang Sanjay Kulkarni to transform your retrieval-augmented generation (RAG) applications from proof of concept to production-ready systems. You’ll learn how to make RAG applications reliable, trustworthy, and scalable and understand best practices for evaluation, testing strategies, and observability while incorporating AI agents to solve complex queries. The hands-on approach ensures that you’ll understand not just the foundational theory, but also how to apply it using tools like Python, LangChain, and LangGraph—all with a focus on achieving reliability in LLM-powered systems. By the end of this course, you’ll be equipped with the knowledge and skills to develop, evaluate, secure, and scale RAG applications.

What you’ll learn and how you can apply it

Learn how to transition RAG applications from proof of concept to fully operational systems
Understand best practices for evaluation, testing, and observability, making LLM-powered systems reliable and scalable
Learn advanced optimization techniques like hybrid search and reranking to improve system performance
Incorporate AI agents and orchestrate workflows using LangGraph to build adaptive LLM applications
Gain skills to implement security and observability, preparing for real-world deployment

This live event is for you because...

You’re a developer with a basic understanding of generative AI.
You’re a developer working on a proof-of-concept RAG application, and you want to make it production-ready and deploy it.
You’re an ML/AI engineer or you want to become one.

Prerequisites

A computer with Python installed and a virtual environment set up with Jupyter Notebook installed
Basic knowledge of Python (to follow the coding exercises)
An OpenAI API key (optional, to participate in exercises)
Basic knowledge of using the OpenAI API
Some understanding of retrieval-augmented generation (RAG)
Experience with prompt engineering

Recommended follow-up:

Read Hands-On RAG for Production (book)
Read RAG with Python Cookbook (book)
Read Designing Large Language Model Applications (book)
Read AI Engineering (book)
View AI Superstream: Retrieval-Augmented Generation (RAG) in Production (video)

Schedule

The time frames are only estimates and may vary according to how the class is progressing.

Review of retrieval-augmented generation (RAG) (60 minutes)

Presentation and demo: The foundations of RAG; diving into RAG architecture (semantic search, embeddings, and vector stores); simple demo for embeddings; the “naive” RAG architecture; basic RAG setup; the interplay between retrieval mechanisms and generative models; common issues with naive RAG; implementation challenges and their impact on overall LLM application performance
Hands-on exercise: Create a simple RAG application using Python
Q&A
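
For orientation, the retrieval half of a “naive” RAG pipeline can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the course material: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the in-memory list stands in for a vector store, and all names (`embed`, `retrieve`, `build_prompt`, `DOCS`) are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: lowercase bag-of-words counts (assumption: stand-in for a real model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, context_docs):
    """Assemble the prompt that would be sent to the generative model."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

# Hypothetical corpus for illustration only.
DOCS = [
    "Vector stores index embeddings for similarity search",
    "Reranking reorders retrieved passages by relevance",
    "Jupyter notebooks mix code and prose",
]

if __name__ == "__main__":
    question = "how do vector stores index embeddings"
    print(build_prompt(question, retrieve(question, DOCS, k=1)))
```

In a real application the resulting prompt would be passed to an LLM; that call is omitted here so the sketch runs without an API key.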

Break

From PoC to production—enhancing RAG responses (105 minutes)

Presentation: Techniques for improving naive RAG performance; optimization strategies for better retrieval (reranking, hybrid search, and metadata filters); refining RAG implementation with advanced techniques; AI agents and orchestrating complex workflows
Hands-on exercises: Refine the RAG implementation with advanced techniques; evaluate and showcase the improved performance
Group discussion: How can we further improve the performance and reliability of our application?
Q&A
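
As a rough illustration of hybrid search, the sketch below merges a keyword ranking and a vector ranking with reciprocal rank fusion (RRF), then applies a metadata filter. RRF is one common fusion choice and is an assumption here; the course description does not say which method is taught. The document IDs, the `METADATA` table, and the function names are hypothetical, and the filter is applied after fusion only for simplicity (many systems filter before retrieval).

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs.

    Each document scores sum(1 / (k + rank)) over the lists it appears in;
    k=60 is a conventional default, not a tuned value.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

def filter_by_metadata(doc_ids, metadata, **required):
    """Keep only documents whose metadata matches every required key/value."""
    return [
        d for d in doc_ids
        if all(metadata.get(d, {}).get(key) == val for key, val in required.items())
    ]

# Hypothetical outputs of a keyword (e.g. BM25) search and a vector search.
keyword_ranking = ["doc_b", "doc_a", "doc_c"]
vector_ranking = ["doc_a", "doc_d", "doc_b"]

# Hypothetical metadata table.
METADATA = {
    "doc_a": {"year": 2024},
    "doc_b": {"year": 2023},
    "doc_c": {"year": 2024},
    "doc_d": {"year": 2024},
}

fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
recent = filter_by_metadata(fused, METADATA, year=2024)

if __name__ == "__main__":
    print(fused)
    print(recent)
```

Documents that rank well in both lists rise to the top, which is the usual motivation for combining keyword and semantic retrieval.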

Break

Building trust in LLM applications (75 minutes)

Presentation: Difference from traditional software development; addressing the black box nature of LLMs; necessity of building trust in AI systems; identifying parameters and factors that can affect trustworthiness; evaluation/testing strategy (component-based and end-to-end evaluations); best practices for evaluations; observability for LLM applications; key metrics to measure performance; best practices for continuous monitoring of evaluations, LLM performance, and application performance and behavior; security (jailbreaks, prompt injection); building security guardrails
Q&A
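
To make component-based evaluation concrete, the following sketch scores only the retrieval component against hand-labeled relevant documents, using recall@k and mean reciprocal rank (MRR). These two metrics are assumptions chosen for illustration; the course description does not name specific metrics, and the test cases and function names are hypothetical.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)

def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0

def evaluate(cases, k=3):
    """Average both metrics over a list of labeled test cases."""
    n = len(cases)
    return {
        "recall@k": sum(recall_at_k(c["retrieved"], c["relevant"], k) for c in cases) / n,
        "mrr": sum(reciprocal_rank(c["retrieved"], c["relevant"]) for c in cases) / n,
    }

# Hypothetical labeled cases: what the retriever returned vs. what was relevant.
CASES = [
    {"retrieved": ["d1", "d2", "d3"], "relevant": {"d2"}},
    {"retrieved": ["d4", "d5", "d6"], "relevant": {"d4", "d9"}},
]

if __name__ == "__main__":
    print(evaluate(CASES, k=3))
```

Tracking such scores over time is one simple way to notice retrieval regressions; end-to-end evaluation of generated answers would need additional checks that are not shown here.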

Your Instructor

Sarang Sanjay Kulkarni
Sarang Sanjay Kulkarni leads a healthcare client account at Thoughtworks, spearheading projects using generative AI to develop advanced research assistants designed to expedite drug discovery timelines. He has over 13 years of experience in application development across various domains such as banking, retail, healthcare, and astronomy. His experience spans the entire development stack, equipping him with expertise across multiple roles, including developer, DevOps engineer, data engineer, and AI engineer. Sarang also serves as a trainer and has facilitated numerous developer bootcamps, data engineering programs, and generative AI training, sharing his knowledge and guiding teams through complex projects.


Skill covered

Retrieval Augmented Generation (RAG)
