Agentic AI Systems in Production
Published by Pearson
Deploy, secure, and operate production-grade LLMs and agentic systems
- Deploy and operate LLMs and agents reliably in production environments.
- Deploy scalable, production-ready RAG architectures.
- Secure and observe agentic AI systems with enterprise-grade practices.
In this 4-hour live course, you’ll learn how teams deploy, scale, and operate LLM-powered applications and agentic systems in real-world production environments. The course covers the full operational lifecycle, from running models locally and in the cloud to deploying robust RAG architectures that scale reliably under real workloads.
The focus is on production realities rather than experimentation. You’ll explore how retrieval pipelines, vector stores, and inference services are designed, optimized, and maintained, and how cost, latency, reliability, and scale influence architectural decisions when moving beyond prototypes.
You’ll also gain practical insight into the security and observability practices required to operate agentic AI systems safely. Through concrete examples, you’ll see how teams defend against prompt-level threats, apply guardrails and access controls, monitor agent behavior, and safely deploy updates without disrupting production systems.
What you’ll learn and how you can apply it
- Deploy and manage LLMs locally and in the cloud, using tools such as Ollama, vLLM, and managed services like AWS Bedrock.
- Design and operate production-ready RAG systems, including ingestion pipelines, vector databases, and optimized retrieval workflows.
- Apply security controls and guardrails to mitigate prompt injection, tool misuse, and other risks specific to agentic AI systems.
- Monitor and operate live AI systems, using observability techniques to track performance, cost, reliability, and agent behavior over time.
This live event is for you because...
This live event is for you because you have moved beyond toying with LLMs and now need to deploy them in production. If you are responsible for reliability, cost, security, and governance of LLM or agent-based applications, this course provides the practical knowledge required to run them with confidence.
This course is designed for people who need practical guidance and insights into the deployment and operation of LLMs, AI applications, and agents.
Prerequisites
- Basic understanding of Generative AI systems, such as ChatGPT
- Coding experience with Python is beneficial, but not required
- Knowledge of cloud environments like AWS
Course Set-up
- No specific setup required
- Course files available here
Recommended Preparation
- Attend: Mastering AI and ML Fundamentals with Rob Barton and Jerome Henry
- Attend: Gen Foundations, Fine-Tuning, RAG, and LLM Application Development with Rob Barton and Jerome Henry
- Read: Demystifying Generative AI: A Practical and Intuitive Introduction by Robert Barton and Jerome Henry
Recommended Follow-up
- Watch: AI & ML Foundations by Robert Barton and Jerome Henry
- Attend: Build Your Own AI Lab with Omar Santos
Schedule
The time frames are only estimates and may vary according to how the class is progressing.
Segment 1: Welcome and Overview (10 min)
- Course objectives and learning outcomes
- Where deployment and operations fit in the LLM and agent lifecycle
- Common challenges when moving from prototype to production
- What you will be able to deploy, secure, and operate by the end of the course
Segment 2: Local Model Management and Inference Serving (40 min)
- Why run models locally: cost, latency, control, and privacy considerations
- Managing models with Ollama: setup, versioning, and lifecycle management
- Exposing local models as inference endpoints
- Comparing Ollama and vLLM: operational trade-offs and use cases
- When local, hybrid, or cloud-first deployments make sense
Q&A (5 min)
Break (5 min)
Segment 3: Agent Frameworks in Production Environments (25 min)
- What changes when agents move from development to production
- Understanding CrewAI, LangChain, and other agent frameworks
- Operational implications of agent-based architectures
- Scaling and operating agent-based systems reliably
Segment 4: Cloud-Based Deployment of LLMs with AWS Bedrock (25 min)
- Cloud deployment models for LLM systems
- Overview of managed LLM platforms and when to use them
- AWS Bedrock architecture, model offerings, and configuration
- Integrating Bedrock into existing application stacks
- Cost, latency, and governance considerations for cloud-based LLMs
Q&A (5 min)
Break (5 min)
Segment 5: Deploying RAG Systems in Production (50 min)
- RAG as a production architecture pattern
- Data ingestion, chunking, and embedding pipelines at scale
- Vector store selection and deployment considerations
- Hybrid retrieval approaches: dense, sparse, and reranking
- Query orchestration and response synthesis
- Evaluating RAG quality, relevance, and cost
- Scaling RAG systems under real-world load
Q&A (5 min)
Break (5 min)
Segment 6: Securing Agentic AI Systems (25 min)
- Threat models specific to LLM and agent-based systems
- Prompt injection, tool misuse, and indirect prompt attacks
- Guardrails, policy enforcement, and runtime controls
- Data privacy, access control, and secrets management
- Audit logging, governance, and compliance considerations
Segment 7: Monitoring and Observability of AI Systems in Production (20 min)
- Why traditional monitoring is insufficient for LLMs and agents
- Core operational metrics: latency, cost, throughput, and reliability
- Observing model behavior and agent decision paths
- Using LangSmith to monitor an agentic AI system (with a live demo)
Q&A (5 min)
Course wrap-up and next steps (5 min)
Your Instructors
Rob Barton
Rob Barton is a Distinguished Engineer with Cisco. Rob has worked in the IT industry for over 27 years, the last 25 of which have been with Cisco. Rob graduated from the University of British Columbia with a degree in Engineering Physics. Rob is a published author, with titles on Generative AI, Quality of Service (QoS), Wireless Communications, and IoT. Additionally, he has co-authored many peer-reviewed research papers and leads Cisco’s academic research partnership program. Rob holds numerous patents in the areas of AI, wireless communications, network security, cloud networking, and IoT. His current areas of work include network automation and agentic models for IT management.
Jerome Henry
Jerome Henry is a Distinguished Engineer in the Office of the Wireless CTO at Cisco Systems. His main field of research is performance optimization in unlicensed wireless networks, including aspects of QoS, IoT, privacy, and indoor location, as well as AI/machine learning and LLMs centered on network languages. Jerome has more than 25 years of experience teaching technical courses in more than 15 countries and 4 languages, to audiences ranging from graduate students to networking professionals and technical support engineers. Jerome joined Cisco in 2012. Before that, he consulted and taught on heterogeneous networks and wireless integration with the European Airespace team; Airespace was later acquired by Cisco and became its main wireless solution.