Generative AI in Production

Published by O'Reilly Media, Inc.

Content level: Intermediate

How to navigate the complexities of deploying and optimizing LLMs in production

Course Outcomes:

  • Learn how to make critical LLM framework decisions
  • Understand how to evaluate LLMs
  • Learn various options for deploying and monitoring LLMs in production

Large language models are fundamentally transforming how AI is integrated into real-world applications, but turning these models into production-ready systems involves far more complexity than building a prototype. Today's most impactful systems go beyond text-only interactions: they leverage multimodal inputs and outputs (text, images, speech) and are increasingly agentic, capable of reasoning, using tools, managing memory, and orchestrating multi-step workflows. In this hands-on course, expert Skanda Vivek guides you through the key concepts needed to build robust, scalable GenAI systems. You'll learn how to design and deploy LLM pipelines using prompt engineering, retrieval-augmented generation (RAG), and tool use. You'll also explore how to evaluate and monitor these systems in production to ensure high quality, reliability, and safety.

Whether you're a developer, ML engineer, or product builder, this course will equip you with the knowledge to harness the full potential of modern LLMs, multimodal models, and agentic AI.

What you’ll learn and how you can apply it

  • Make critical LLM framework decisions
  • Evaluate LLMs
  • Deploy and monitor LLMs in production

This live event is for you because...

  • You’re a developer who’s integrating LLMs into a product.
  • You’re a CTO/CDO who wants to integrate LLMs into your business.
  • You're a software engineer who wants to learn more about LLMs to upskill or apply for ML engineer jobs.

Prerequisites

  • Familiarity with fundamental machine learning concepts (classification and regression, model training and testing, loss functions, backpropagation, etc.)
  • Familiarity with software development in Python
  • Familiarity with ChatGPT

Schedule

The time frames are only estimates and may vary according to how the class is progressing.

Build a production-ready GenAI application (60 minutes)

  • Presentation: Designing GenAI Apps
  • Question: “What about this course attracts you?” (tutorials, deployment, optimization/evals; open-ended, chat)
  • Presentation: Customizing LLMs for business and enterprise use cases
  • Hands-on exercise: Build a production-ready RAG application (includes prototyping, evaluating, and production code)
  • Presentation: AI Agents
  • Group discussion: When do you choose RAG versus fine-tuning?
  • Q&A
  • Break

Optimizing GenAI Applications (45 minutes)

  • Presentation: LLM as a Judge for evals
  • Presentation: Automatic Prompt Optimization with DSPy
  • Presentation: Model quantization for optimizing resources
  • Hands-on exercise: Locally running a quantized model
  • Group discussion: How is LLM eval different from traditional ML eval? What are the biggest concerns for LLM performance? Can we completely remove hallucinations?
  • Q&A
  • Break

Scaling and Deployment (75 minutes)

  • Discussion: “What is usually hardest for you in deploying an application?”
  • Presentation: Databases and integration
  • Hands-on exercise: Deploy an open-source LLM API using Hugging Face + AWS
  • Presentation: Monitoring and making improvements
  • Presentation: Agentic protocols (MCP, A2A)
  • Group discussion: Challenges in bringing a GenAI app into production
  • Q&A

Your Instructor

  • Skanda Vivek

    Skanda Vivek is a senior data scientist at Intuit, working on generative AI. Previously, he was a senior data scientist on the risk intelligence team at OnSolve, where he developed advanced AI-based algorithms for rapidly detecting critical emergencies through big data. He has also been an assistant professor and a postdoctoral fellow at Georgia Tech. His work has been published in multiple scientific journals as well as broadcast widely by outlets such as the BBC and Forbes. Skanda is passionate about sharing knowledge and teaches data- and AI-focused courses with O’Reilly. He received his PhD in physics from Emory University.

Skill covered

Large Language Models (LLMs)