Chapter 11: Diagnosing Problems

Finally, after instrumenting application code, configuring a collector to transmit the data, and setting up a backend to receive the telemetry, we have all the pieces in place to observe a system. But what does that mean? How can we detect abnormalities in a system with all these tools? That is the subject of this chapter, which looks through the lens of an analyst to see what shape the data takes as events occur in a system. To do this, we'll look at the following areas:

  • How leaning on chaos engineering can provide the framework for running experiments in a system
  • Common scenarios of issues that can arise in distributed systems
  • Tools that allow us to introduce failures into our system ...

