Generative AI in Production

Intermediate

How to navigate the complexities of deploying and optimizing LLMs in production

Course Outcomes:

Learn how to make critical LLM framework decisions
Understand how to evaluate LLMs
Learn various options for deploying and monitoring LLMs in production

Large language models are fundamentally transforming how AI is integrated into real-world applications, but turning these models into production-ready systems is far more complex than building a prototype. Today's most impactful systems go beyond text-only interactions: they leverage multimodal inputs and outputs (text, images, speech) and are increasingly agentic, capable of reasoning, using tools, managing memory, and orchestrating multi-step workflows. In this hands-on course, expert Skanda Vivek guides you through the key concepts needed to build robust, scalable GenAI systems. You’ll learn how to design and deploy LLM pipelines using prompt engineering, retrieval-augmented generation (RAG), and tool use. You'll also explore how to evaluate and monitor these systems in production to ensure high quality, reliability, and safety.

Whether you're a developer, ML engineer, or product builder, this course will equip you with the knowledge to harness the full potential of modern LLMs, multimodal models, and agentic AI.

What you’ll learn and how you can apply it

Make critical LLM framework decisions
Evaluate LLMs
Deploy and monitor LLMs in production

This live event is for you because...

You’re a developer who’s integrating LLMs into a product.
You’re a CTO/CDO who wants to integrate LLMs into your business.
You're a software engineer who wants to learn more about LLMs to upskill or apply for ML engineer jobs.

Prerequisites

Familiarity with fundamental machine learning concepts (classification and regression, model training and testing, loss functions, backpropagation, etc.)
Familiarity with software development in Python
Familiarity with ChatGPT

Recommended preparation:

Access to ChatGPT (optional for following along with the exercises)
Read Hands-On Large Language Models (book)
Read Prompt Engineering for Generative AI (book)

Recommended follow-up:

Read LLMOps (book)
Take MLOps/LLMOps Bootcamp (live course)

Schedule

The time frames are only estimates and may vary according to how the class is progressing.

Build a production-ready GenAI application (60 minutes)

Presentation: Designing GenAI Apps
Question: “What about this course attracts you?” (tutorials, deployment, optimization/evals; open-ended, chat)
Presentation: Customizing LLMs for business and enterprise use cases
Hands-On Exercise: Build a production-ready RAG application (includes prototyping, evaluating, and production code)
Presentation: AI Agents
Group discussion: When do you choose RAG versus fine-tuning?
Q&A
Break

Optimizing GenAI Applications (45 minutes)

Presentation: LLM as a Judge for evals
Presentation: Automatic Prompt Optimization with DSPy
Presentation: Model quantization for optimizing resources
Hands-On Exercise: Run a quantized model locally
Group discussion: How is LLM eval different from traditional ML eval? What are the biggest concerns for LLM performance? Can we completely remove hallucinations?
Q&A
Break

Scaling and Deployment (75 minutes)

Discussion: “What is usually hardest for you in deploying an application?”
Presentation: Databases and integration
Hands-On Exercise: Deploy an open-source LLM API using Hugging Face + AWS
Presentation: Monitoring and making improvements
Presentation: Agentic protocols (MCP, A2A)
Group discussion: Challenges in bringing a GenAI app into production
Q&A

Your Instructor

Skanda Vivek
Skanda Vivek is a senior data scientist at Intuit, working on generative AI. Previously, he was a senior data scientist on the risk intelligence team at OnSolve, where he developed advanced AI-based algorithms for rapidly detecting critical emergencies through big data. He has also been an assistant professor and a postdoctoral fellow at Georgia Tech. His work has been published in multiple scientific journals as well as broadcast widely by outlets such as the BBC and Forbes. Skanda is passionate about sharing knowledge and teaches data- and AI-focused courses with O’Reilly. He received his PhD in physics from Emory University.

Skill covered

Large Language Models (LLMs)
