Production LLM Monitoring: Observability, Tracing & Cost Optimization
with Paulo Dichone
Overview
In this 2-hour course, you will learn how to implement production-grade LLM observability, tracing, and cost optimization using Langfuse, enabling faster debugging, reliable monitoring, and tighter control over LLM API spending in real-world systems.
What you will be able to do after this course
- Implement production-grade LLM observability using Langfuse and tracing concepts
- Reduce LLM API costs using semantic caching, model routing, and prompt optimization
- Debug LLM applications quickly using traces, spans, and instrumentation patterns
- Set up cost alerts and monitoring dashboards to prevent budget overruns
- Build production-ready patterns for token tracking, cost calculation, and PII redaction
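To give a feel for the last outcome, here is a minimal sketch of per-call cost calculation and basic PII redaction in plain Python. It is not taken from the course material: the model names, the per-million-token prices, and the helper names `call_cost` and `redact_emails` are all hypothetical, and the email pattern is deliberately simple rather than exhaustive.

```python
import re

# Hypothetical prices in USD per one million tokens.
# Real prices differ by provider and change over time.
PRICES_PER_MILLION = {
    "small-model": {"input": 0.50, "output": 1.50},
    "large-model": {"input": 5.00, "output": 15.00},
}


def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one LLM call from its token counts."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Simple email pattern for illustration; production redaction usually
# covers more PII types (phone numbers, IDs, names) and edge cases.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def redact_emails(text: str) -> str:
    """Replace email addresses with a placeholder before logging a prompt."""
    return EMAIL_RE.sub("[REDACTED_EMAIL]", text)


if __name__ == "__main__":
    print(call_cost("large-model", input_tokens=2000, output_tokens=1000))
    print(redact_emails("Contact jane.doe@example.com about the invoice."))
```

In a traced application, a cost figure like this would typically be attached to each call's trace alongside the token counts, and redaction would run before the prompt text is sent to the observability backend.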
Course Instructor(s)
Paulo Dichone is a software engineer and educator who has taught 280,000+ students across 175 countries. He is the founder of Build Apps with Paulo and delivers practical, career-focused training in software development and cloud solutions. His teaching emphasizes real-world implementation that prepares learners for production challenges.
Who is it for?
This course is ideal for ML engineers, AI engineers, backend developers, and technical leads running LLM applications in production who need visibility into performance and costs. Learners should have basic Python skills, prior experience making LLM API calls, and a working Python setup with a code editor.