Infrastructure & Ops Superstream: AI-Driven Operations and Observability
conference

by Sam Newman, Niall Richard Murphy, Abi Aryan, Austin Parker, Aman Khan, Dylan Patel, Milly Leadley
August 2025
3h 20m
English
O'Reilly Media, Inc.
Superstream available
Closed Captioning available in German, English, Spanish, French, Italian, Japanese, Korean, Portuguese (Portugal, Brazil), Chinese (Simplified), Chinese (Traditional)

Overview

AI isn't just playing around the edges of IT; it's fundamentally reshaping how tech is built, run, and managed.

Host Sam Newman and industry experts share practical insights and actionable strategies for dealing with the dual transformations triggered by AI: how it optimizes operations, and how infrastructure must evolve to truly harness its power. Our panel addresses AI-driven operations and observability (AIOps) to show how machine learning enhances traditional IT functions, including automating crucial tasks such as incident management and system performance monitoring. You'll learn how AIOps helps system reliability, platform, and DevOps teams find root causes faster, reduce alert fatigue, and implement strong predictive maintenance. You'll also learn about the infrastructure essential for AI itself, including the specialized systems needed to reliably power demanding AI and ML workloads, and explore how the core principles of operating technology systems are vital for building and maintaining next-generation AI environments. These sessions will leave you better equipped both to optimize your tech operations with AI and to deploy it confidently at scale, ultimately improving system reliability and efficiency across all your technology.

What you’ll learn and how you can apply it

  • Gain a comprehensive understanding of the evolving landscape of AIOps, including the challenges and opportunities for infrastructure and operations to effectively support and manage AI workloads
  • Learn practical methods for evaluating and improving LLM agents in AIOps using measurable, testable techniques
  • Master strategies for comprehensive observability across the entire LLM pipeline, from logging to error tracking, to build resilient AI systems
  • Discover advanced evaluation and monitoring stacks used by top AI teams to identify and prevent AI system failures

Please note that slides or supplemental materials are not available for download from this recording. Resources are only provided at the time of the live event.

You might also like

Infrastructure & Ops Superstream: AI Infrastructure

Sam Newman, Abhishek Veeramalla, Colleen Tartow, John McBride, Ronald Petty, Adrián González Sánchez, Charity Majors
Infrastructure & Ops Superstream: Platform Engineering

Sam Newman, Suhail Patel, Nicki Watt, Matthew Skelton, Uday Kiran Medisetty
Infrastructure & Ops Superstream: Platform Engineering Best Practices

Sam Newman, Adora Nwodo, Juliano Martins, Marcelo Quadros, Ama Asare, David Grizzanti, Ahmed Bebars, Chiradeep Vittal, Moo Olaniyan

Publisher Resources

ISBN: 0642572018264