Chapter 1. Chaos Engineering Distilled
Want your system to be able to deal with the knocks and shakes of life in production? Want to find out where the weaknesses are in your infrastructure, platforms, applications, and even people, policies, practices, and playbooks before you’re in the middle of a full-scale outage? Want to adopt a practice where you proactively explore weaknesses in your system before your users complain? Welcome to chaos engineering.
Chaos engineering is an exciting discipline whose goal is to surface evidence of weaknesses in a system before those weaknesses become critical issues. Through experiments, you gain useful insights into how your system will respond to the types of turbulent conditions that happen in production.
This chapter takes you on a tour of what chaos engineering is, and what it isn’t, to get you in the right mind-set to use the techniques and tools that are the main feature of the rest of the book.
Chaos Engineering Defined
According to the Principles of Chaos Engineering:
Chaos Engineering is the discipline of experimenting on a system in order to build confidence in the system’s capability to withstand turbulent conditions in production.
Users of a system want it to be reliable. Many factors can affect reliability (see “Locations of Dark Debt”), and as chaos engineers we focus on establishing evidence of how resilient our systems are in the face of these unexpected, but inevitable, conditions.
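To make the definition concrete, here is a minimal sketch of what "experimenting on a system" could look like in Python. Everything in it is hypothetical and invented for illustration: the `Dependency` and `Service` classes, the `steady_state_ok` check, and the `run_experiment` function are toy stand-ins, not part of any real chaos engineering tool. The sketch assumes the steady state is "every request gets an answer" and the turbulent condition is "a downstream dependency becomes unavailable."

```python
class Dependency:
    """Hypothetical downstream dependency that can be switched into a failing state."""

    def __init__(self):
        self.failing = False

    def fetch(self):
        if self.failing:
            raise ConnectionError("dependency unavailable")
        return "fresh value"


class Service:
    """Hypothetical service that falls back to a cached value when its dependency fails."""

    def __init__(self, dependency):
        self.dependency = dependency
        self.cached_value = "cached value"

    def handle_request(self):
        try:
            return self.dependency.fetch()
        except ConnectionError:
            return self.cached_value


def steady_state_ok(service, requests=100):
    """Assumed steady state: every request returns a non-empty answer without raising."""
    try:
        return all(service.handle_request() for _ in range(requests))
    except Exception:
        return False


def run_experiment(service, dependency):
    """Check the steady state before, during, and after the turbulent condition."""
    before = steady_state_ok(service)
    dependency.failing = True  # introduce the turbulent condition
    try:
        during = steady_state_ok(service)
    finally:
        dependency.failing = False  # always roll the condition back
    after = steady_state_ok(service)
    return {"before": before, "during": during, "after": after}


dependency = Dependency()
service = Service(dependency)
findings = run_experiment(service, dependency)
print(findings)
```

If `during` came back `False`, the experiment would have surfaced evidence of a weakness; if it stays `True`, you have gained some confidence that the system withstands this particular condition. Either outcome is useful evidence.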