Breaking Things to Make Them Better
The principles of chaos engineering[93] define the practice as “the discipline of experimenting on a distributed system in order to build confidence in the system’s capability to withstand turbulent conditions in production.” That means it’s empirical rather than formal. We don’t use models to understand what the system should do. We run experiments to learn what it does.
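To make "run experiments to learn what it does" concrete, here is a minimal sketch of the shape of such an experiment. It is a toy, in-process simulation, not a real chaos tool and not anything prescribed by the principles: the names call_dependency, handle_request, and run_experiment, the retry count, and the failure rates are all hypothetical, chosen only for illustration. The idea is to measure a steady-state metric (success rate), inject turbulence (dependency failures), and observe what the system actually does.

```python
import random


def call_dependency(rng, failure_rate):
    """Hypothetical downstream call; fails with probability failure_rate."""
    if rng.random() < failure_rate:
        raise ConnectionError("injected fault")
    return "ok"


def handle_request(rng, failure_rate, retries=2):
    """Hypothetical request handler: the first attempt plus up to `retries` retries."""
    for _ in range(retries + 1):
        try:
            return call_dependency(rng, failure_rate)
        except ConnectionError:
            continue
    return None  # every attempt failed


def run_experiment(failure_rate, requests=1000, seed=42):
    """Return the observed success rate under the injected failure rate."""
    rng = random.Random(seed)  # seeded so repeated runs are comparable
    successes = sum(
        handle_request(rng, failure_rate) is not None for _ in range(requests)
    )
    return successes / requests


if __name__ == "__main__":
    baseline = run_experiment(failure_rate=0.0)   # steady state, no faults
    turbulent = run_experiment(failure_rate=0.3)  # injected turbulence
    print(f"baseline success rate:  {baseline:.3f}")
    print(f"turbulent success rate: {turbulent:.3f}")
```

The point of the sketch is the comparison, not the numbers: the hypothesis ("retries keep the success rate close to baseline") is checked by observation rather than derived from a model. A real experiment would make the same comparison against production traffic, where the results carry weight that a simulation cannot.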
Chaos engineering deals with distributed systems, frequently large-scale systems. Staging or QA environments aren’t much of a guide to the large-scale behavior of systems in production. In Scaling Effects, we saw how different ratios of instances can cause qualitatively different behavior in production. That also applies to ...