Building Reliable RAG Applications: From PoC to Production

Published by O'Reilly Media, Inc.

Content level: Intermediate

Build, optimize, and deploy RAG applications for production

Course outcomes

  • Understand the evolution of RAG applications and apply best practices to enhance performance
  • Gain practical guidance on transitioning a RAG application from the proof-of-concept stage to a fully operational production environment
  • Learn how to operationalize RAG applications, focusing on security, observability, and best practices for maintenance

Join expert Sarang Sanjay Kulkarni to transform your retrieval-augmented generation (RAG) applications from proof of concept to production-ready systems. You’ll learn how to make RAG applications reliable, trustworthy, and scalable, and you’ll cover best practices for evaluation, testing, and observability while incorporating AI agents to solve complex queries. The hands-on approach ensures that you’ll understand not just the foundational theory but also how to apply it using tools like Python, LangChain, and LangGraph, all with a focus on achieving reliability in LLM-powered systems. By the end of this course, you’ll be equipped with the knowledge and skills to develop, evaluate, secure, and scale RAG applications.

What you’ll learn and how you can apply it

  • Learn how to transition RAG applications from proof of concept to fully operational systems
  • Understand best practices for evaluation, testing, and observability, making LLM-powered systems reliable and scalable
  • Learn advanced optimization techniques like hybrid search and reranking to improve system performance
  • Incorporate AI agents and orchestrate workflows using LangGraph to build adaptive LLM applications
  • Gain skills to implement security and observability, preparing for real-world deployment

This live event is for you because...

  • You’re a developer with a basic understanding of generative AI.
  • You’re a developer who’s working on a proof-of-concept RAG application and wants to make it production-ready and deploy it.
  • You’re an ML/AI engineer or you want to become one.

Prerequisites

  • A computer with Python installed and a virtual environment set up with Jupyter Notebook installed
  • Basic knowledge of Python (to follow the coding exercises)
  • An OpenAI API key (optional, to participate in exercises)
  • Basic knowledge of using the OpenAI API
  • Some understanding of retrieval-augmented generation
  • Experience with prompt engineering

Schedule

The time frames are only estimates and may vary according to how the class is progressing.

Review of retrieval-augmented generation (RAG) (60 minutes)

  • Presentation and demo: The foundations of RAG; diving into RAG architecture (semantic search, embeddings, and vector stores); simple demo for embeddings; the “naive” RAG architecture; basic RAG setup; the interplay between retrieval mechanisms and generative models; common issues with naive RAG; implementation challenges and their impact on overall LLM application performance
  • Hands-on exercise: Create a simple RAG application using Python
  • Q&A
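
For orientation, the retrieval half of a “naive” RAG setup can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the course's exercise: the bag-of-words `embed` function stands in for a real embedding model, the in-memory `DOCS` list stands in for a vector store, and all names here are hypothetical. The resulting prompt would then be sent to a generative model.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy 'embedding': term counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, context_docs):
    """Assemble the augmented prompt that would be passed to the LLM."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

# Hypothetical in-memory corpus standing in for a vector store.
DOCS = [
    "Vector stores index embeddings for similarity search.",
    "Reranking reorders retrieved passages with a stronger model.",
    "Jupyter notebooks mix code and prose.",
]

if __name__ == "__main__":
    question = "How do vector stores use embeddings?"
    print(build_prompt(question, retrieve(question, DOCS, k=1)))
```

Swapping `embed` for a learned embedding model and `DOCS` for a vector store gives the basic architecture; the retrieval and prompt-assembly steps keep the same shape.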

Break

From PoC to production—enhancing RAG responses (105 minutes)

  • Presentation: Techniques for improving naive RAG performance; optimization strategies for better retrieval (reranking, hybrid search, and metadata filters); refining RAG implementation with advanced techniques; AI agents and orchestrating complex workflows
  • Hands-on exercises: Refine the RAG implementation with advanced techniques; evaluate and showcase the improved performance
  • Group discussion: How can we further improve the performance and reliability of our application?
  • Q&A
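
As a rough illustration of hybrid search, one common way to merge a keyword ranking with a vector ranking is reciprocal rank fusion (RRF). The sketch below is an assumption about how such a merge could look, not the technique the course necessarily uses; the document IDs and the constant `k=60` (a conventional default) are illustrative.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken by document ID.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))

if __name__ == "__main__":
    keyword_hits = ["a", "b", "c"]  # hypothetical keyword (e.g. BM25) ranking
    vector_hits = ["b", "c", "d"]   # hypothetical embedding-similarity ranking
    print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

Documents that appear in both lists rise to the top, which is the usual motivation for combining lexical and semantic retrieval before a reranking step.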

Break

Building trust in LLM applications (75 minutes)

  • Presentation: Difference from traditional software development; addressing the black box nature of LLMs; necessity of building trust in AI systems; identifying parameters and factors that can affect trustworthiness; evaluation/testing strategy (component-based and end-to-end evaluations); best practices for evaluations; observability for LLM applications; key metrics to measure performance; best practices for continuous monitoring of evaluations, LLM performance, and application performance and behavior; security (jailbreaks, prompt injection); building security guardrails
  • Q&A

Your Instructor

  • Sarang Sanjay Kulkarni

    Sarang Sanjay Kulkarni leads a healthcare client account at Thoughtworks, spearheading projects using generative AI to develop advanced research assistants designed to expedite drug discovery timelines. He has over 13 years of experience in application development across various domains such as banking, retail, healthcare, and astronomy. His experience spans the entire development stack, equipping him with expertise across multiple roles, including developer, DevOps engineer, data engineer, and AI engineer. Sarang also serves as a trainer and has facilitated numerous developer bootcamps, data engineering programs, and generative AI training, sharing his knowledge and guiding teams through complex projects.

Skill covered

Retrieval Augmented Generation (RAG)