Deploying AI Systems Safely
Published by O'Reilly Media, Inc.
Evaluation, versioning, and rollout strategies
What you’ll learn and how you can apply it
- Design an AI release workflow with evaluation gates for prompts, models, and tool changes
- Build a versioning strategy for prompts, configurations, test sets, and release artifacts
- Create a rollout plan that includes canary releases, monitoring, and rollback criteria
- Diagnose common post-release failures using traces, metrics, and release documentation
Course description
AI teams are getting better at building prototypes, but many still struggle to ship changes to production safely. Unlike traditional software, AI systems can change behavior when prompts are updated, retrieval settings are adjusted, tools are added, or model versions are swapped, which makes release decisions harder. Teams need a practical way to evaluate changes, version the right artifacts, and roll out updates without introducing silent regressions.
Machine learning engineer Apurva Misra shares operational strategies that will help you deploy AI systems safely. You’ll work through a practical framework for release readiness, including evaluation gates, prompt and configuration versioning, regression checks, staged rollouts, monitoring, and rollback planning. Through guided exercises, you’ll review sample release scenarios, identify deployment risks, design safer rollout plans, and build lightweight processes that you can apply to LLM applications and agentic systems on the job.
This live event is for you because...
- You’re a software, ML, AI, or platform engineer working on LLM applications or agents.
- You work with prompts, retrieval pipelines, tools, workflow orchestration, or release processes for AI systems.
- You want to move from ad hoc launches to a repeatable, safer deployment process for AI features.
Prerequisites
- A computer with a browser and access to a text editor
- A working understanding of LLM applications or agentic systems
- Familiarity with concepts such as prompts, model calls, and retrieval
- Basic experience with software delivery and release processes
- Comfort reading simple technical artifacts such as configs, test cases, and dashboards
Recommended preparation:
- Download the course worksheets and templates (link to come)
- Review a short primer on common AI release failure modes (link to come)
Recommended follow-up:
- Take MLOps/LLMOps Bootcamp (live online course with Ammar Mohanna)
- Read LLMOps (book)
- Read LLMs in Production (book)
- Explore Production LLM Monitoring (on-demand course)
Schedule
The time frames are only estimates and may vary depending on how the class progresses.
Why AI deployments fail after the demo (60 minutes)
- Presentation: What makes AI releases different from standard software releases; release risks across prompts, models, retrieval, and tools
- Group discussion: Where do releases currently break in your team?
- Hands-on exercise: Identify failure points in a sample AI release workflow
- Q&A
- Break
Evaluation gates and versioning (55 minutes)
- Presentation: Designing practical evaluation gates before release
- Demonstration: What to version in an AI system (prompts, configs, eval sets, models, and tool definitions)
- Hands-on exercise: Build a release checklist and versioning plan for a sample system
- Q&A
- Break
Safe rollouts, monitoring, and rollback planning (55 minutes)
- Presentation: Safe rollout strategies for AI systems, including canary releases and phased exposure
- Group discussion: What should trigger a rollback?
- Hands-on exercise: Create monitoring and rollback criteria for a production change
- Q&A
Wrap-up and Q&A (10 minutes)
Your Instructor
Apurva Misra
Apurva Misra is a machine learning engineer, AI consultant, speaker, and founder of Sentick. She works with teams building practical AI systems, with a focus on moving from prototypes to reliable production workflows. Her work includes agentic systems, evaluation, context design, and the operational patterns required to make AI systems usable in real organizations.