The DevOps Handbook, 2nd Edition
by Gene Kim, Jez Humble, Patrick Debois, John Willis, Nicole Forsgren
14
CREATE TELEMETRY TO ENABLE SEEING AND SOLVING PROBLEMS
A fact of life in Operations is that things go wrong—small changes may result in many unexpected outcomes, including outages and global failures that impact all our customers. This is the reality of operating complex systems; no single person can see the whole system and understand how all the pieces fit together.
When production outages and other problems occur in our daily work, we often don’t have the information we need to solve the problem. For example, during an outage we may not be able to determine whether the issue is due to a failure in our application (e.g., defect in the code), in our environment (e.g., a networking problem, server configuration problem), or something entirely ...