Agentic AI Systems in Production

Published by Pearson

Intermediate

Deploy, secure, and operate production-grade LLMs and agentic systems

Deploy and operate LLMs and agents reliably in production environments.
Deploy scalable, production-ready RAG architectures.
Secure and observe agentic AI systems with enterprise-grade practices.

In this 4-hour live course, you’ll learn how teams deploy, scale, and operate LLM-powered applications and agentic systems in real-world production environments. The course covers the full operational lifecycle, from running models locally and in the cloud to deploying robust RAG architectures that scale reliably under real workloads.

The focus is on production realities rather than experimentation. You’ll explore how retrieval pipelines, vector stores, and inference services are designed, optimized, and maintained, and how cost, latency, reliability, and scale influence architectural decisions when moving beyond prototypes.

You’ll also gain practical insight into the security and observability practices required to operate agentic AI systems safely. Through concrete examples, you’ll see how teams defend against prompt-level threats, apply guardrails and access controls, monitor agent behavior, and safely deploy updates without disrupting production systems.

What you’ll learn and how you can apply it

Deploy and manage LLMs locally and in the cloud, using tools such as Ollama, vLLM, and managed services like AWS Bedrock.
Design and operate production-ready RAG systems, including ingestion pipelines, vector databases, and optimized retrieval workflows.
Apply security controls and guardrails to mitigate prompt injection, tool misuse, and other risks specific to agentic AI systems.
Monitor and operate live AI systems, using observability techniques to track performance, cost, reliability, and agent behavior over time.

This live event is for you because...

This live event is for you because you have moved beyond toying with LLMs and now need to deploy them in production. If you are responsible for reliability, cost, security, and governance of LLM or agent-based applications, this course provides the practical knowledge required to run them with confidence.

This course is designed for people who need practical guidance and insights into the deployment and operation of LLMs, AI applications, and agents.

Prerequisites

Basic understanding of Generative AI systems, such as ChatGPT
Coding experience with Python is beneficial, but not required
Knowledge of cloud environments like AWS

Course Set-up

No specific setup required
Course files available here

Recommended Preparation

Attend: Mastering AI and ML Fundamentals with Rob Barton and Jerome Henry
Attend: Gen Foundations, Fine-Tuning, RAG, and LLM Application Development with Rob Barton and Jerome Henry
Read: Demystifying Generative AI: A Practical and Intuitive Introduction by Robert Barton and Jerome Henry

Recommended Follow-up

Watch: AI & ML Foundations by Robert Barton and Jerome Henry
Attend: Build Your Own AI Lab with Omar Santos

Schedule

The time frames are only estimates and may vary according to how the class is progressing.

Segment 1: Welcome and Overview (10 min)

Course objectives and learning outcomes
Where deployment and operations fit in the LLM and agent lifecycle
Common challenges when moving from prototype to production
What you will be able to deploy, secure, and operate by the end of the course

Segment 2: Local Model Management and Inference Serving (40 min)

Why run models locally: cost, latency, control, and privacy considerations
Managing models with Ollama: setup, versioning, and lifecycle management
Exposing local models as inference endpoints
Comparing Ollama and vLLM: operational trade-offs and use cases
When local, hybrid, or cloud-first deployments make sense

Q&A (5 minutes)

Break (5 min)

Segment 3: Agent Frameworks in Production Environments (25 min)

What changes when agents move from development to production
Understanding CrewAI, LangChain, and other agent frameworks
Operational implications of agent-based architectures
Scaling and operating agent-based systems reliably

Segment 4: Cloud-Based Deployment of LLMs with AWS Bedrock (25 min)

Cloud deployment models for LLM systems
Overview of managed LLM platforms and when to use them
AWS Bedrock architecture, model offerings, and configuration
Integrating Bedrock into existing application stacks
Cost, latency, and governance considerations for cloud-based LLMs

Q&A (5 minutes)

Break (5 min)

Segment 5: Deploying RAG Systems in Production (50 min)

RAG as a production architecture pattern
Data ingestion, chunking, and embedding pipelines at scale
Vector store selection and deployment considerations
Hybrid retrieval approaches: dense, sparse, and reranking
Query orchestration and response synthesis
Evaluating RAG quality, relevance, and cost
Scaling RAG systems under real-world load

Q&A (5 minutes)

Break (5 min)

Segment 6: Securing Agentic AI Systems (25 min)

Threat models specific to LLM and agent-based systems
Prompt injection, tool misuse, and indirect prompt attacks
Guardrails, policy enforcement, and runtime controls
Data privacy, access control, and secrets management
Audit logging, governance, and compliance considerations

Segment 7: Monitoring and Observability of AI Systems in Production (20 min)

Why traditional monitoring is insufficient for LLMs and agents
Core operational metrics: latency, cost, throughput, and reliability
Observing model behavior and agent decision paths
Using LangSmith to monitor an agentic AI system (with a live demo)

Q&A (5 minutes)

Course wrap-up and next steps (5 minutes)

Your Instructors

Rob Barton
Rob Barton is a Distinguished Engineer with Cisco. Rob has worked in the IT industry for over 27 years, the last 25 of which have been with Cisco. Rob graduated from the University of British Columbia with a degree in Engineering Physics. Rob is a published author, with titles on Generative AI, Quality of Service (QoS), Wireless Communications, and IoT. Additionally, he has co-authored many peer-reviewed research papers and leads Cisco’s academic research partnership program. Rob holds numerous patents in the areas of AI, wireless communications, network security, cloud networking, and IoT. His current areas of work include network automation and agentic models for IT management.

Jerome Henry
Jerome Henry is a Distinguished Engineer in the Office of the Wireless CTO at Cisco Systems. His main field of research is performance optimization in unlicensed wireless networks, including aspects of QoS, IoT, privacy, and indoor location, as well as AI/machine learning and LLMs centered on network languages. Jerome has more than 25 years of experience teaching technical courses in more than 15 countries and 4 languages, to audiences ranging from graduate students to networking professionals and technical support engineers. Jerome joined Cisco in 2012. Before that, he consulted on and taught heterogeneous networks and wireless integration with the European Airespace team, which was later acquired by Cisco to become its main wireless solution.


Skills covered

Generative AI

Cloud Computing

Data Engineering

Data Science

AI & ML

Programming Languages

Software Architecture

IT/Ops

Security

Design

Business

Soft Skills
