Distributed Systems Observability

Name: Distributed Systems Observability
Author: Cindy Sridharan
ISBN: 9781492033424

July 2018

Intermediate to advanced

34 pages

46m

English

O'Reilly Media, Inc.

1. The Need for Observability
What Is Observability?; Observability Isn’t Purely an Operational Concern; Conclusion
2. Monitoring and Observability
Alerting Based on Monitoring Data; Best Practices for Alerting; What Monitoring Signals to Use for Alerting?; Debugging “Unmonitorable” Failures; Observability Isn’t a Panacea; Conclusion
3. Coding and Testing for Observability
Coding for Failure; Operational Semantics of the Application; Operational Characteristics of the Dependencies; Debuggable Code; Testing for Failure; Conclusion
4. The Three Pillars of Observability
Event Logs; The Pros and Cons of Logs; Logging as a Stream Processing Problem; Metrics; The Anatomy of a Modern Metric; Advantages of Metrics over Event Logs; The Drawbacks of Metrics; Tracing; The Challenges of Tracing; Service Meshes: A New Hope for the Future?; Conclusion
5. Conclusion

Chapter 5. Conclusion

As my friend Brian Knox, who manages the Observability team at DigitalOcean, said:

“The goal of an Observability team is not to collect logs, metrics, or traces. It is to build a culture of engineering based on facts and feedback, and then spread that culture within the broader organization.”

The same can be said about observability itself, in that it’s not about logs, metrics, or traces, but about being data driven during debugging and using the feedback to iterate on and improve the product.

The value of the observability of a system primarily stems from the business and organizational value derived from it. Being able to debug and diagnose production issues quickly not only makes for a great end-user experience, but also paves the way toward the humane and sustainable operability of a service, including the on-call experience. A sustainable on-call is possible only if the engineers building the system place primacy on designing reliability into a system. Reliability isn’t birthed in an on-call shift.

For many, if not most, businesses, a good alerting strategy and time-series-based “monitoring” are probably all that’s required to deliver on these goals. For others, being able to debug needle-in-a-haystack types of problems might be what’s needed to generate the most value.

Observability, as such, isn’t an absolute. Pick your own observability target based on the requirements of your services.

Publisher Resources

ISBN: 9781492033431
