Chapter 7. The Journey into SCE

By simply “responding” to “incidents,” the security industry overlooks valuable chances to further understand and nurture those incidents as opportunities to proactively strengthen system resilience. What if instead we proactively and purposefully initiated incidents expressly to learn their impacts and design graceful automated responses?

Validate Known Assumptions

If the majority of malicious code is designed to prey on the unsuspecting, ill-prepared, or unknowing practitioner, aka “the low-hanging fruit,” then it makes sense to ask the following questions: How many attacks would still succeed if there weren’t such a large attack surface to begin with? Could it be that this “low-hanging fruit” is the key to proactively understanding how our systems, and the humans who build and operate them, behave?

If we always expected humans and systems to behave in unintended ways, perhaps we would act differently and have more useful views regarding system behavior: assume failure, and design the system to expect failures and handle them gracefully.

We should focus on learning from the failures that happen in our systems. Through this shift in thinking, we can begin to understand what it takes to build more resilient systems. By building resilient systems based on learning from experimental testing, we make unsophisticated criminals and attackers work harder for their money.
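To make the idea of learning from experimental testing concrete, here is a minimal sketch of a single experiment, written in Python. Everything in it is an assumption made for illustration: ToyFirewall is a hypothetical in-memory stand-in for a real security control, and the hypothesis ("an unauthorized port change is blocked and raises an alert") is an example assumption, not one taken from this chapter.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFirewall:
    """Hypothetical in-memory stand-in for a real security control."""
    allowed_ports: set = field(default_factory=lambda: {443})
    alerts: list = field(default_factory=list)
    detection_enabled: bool = True

    def apply_change(self, port: int) -> bool:
        """Return True if the port was opened, False if the change was blocked."""
        if port in self.allowed_ports:
            return True
        if self.detection_enabled:
            self.alerts.append(f"blocked unauthorized port {port}")
            return False
        # The control silently failed: the unauthorized change went through.
        return True


def run_experiment(firewall: ToyFirewall, port: int) -> dict:
    """Inject one unauthorized change and compare the outcome with the hypothesis.

    Hypothesis (assumed): the change is blocked and an alert is raised.
    """
    alerts_before = len(firewall.alerts)
    opened = firewall.apply_change(port)
    alerted = len(firewall.alerts) > alerts_before
    return {
        "blocked": not opened,
        "alerted": alerted,
        "hypothesis_held": (not opened) and alerted,
    }
```

Running run_experiment(ToyFirewall(), 23) confirms the hypothesis, while run_experiment(ToyFirewall(detection_enabled=False), 23) shows it failing. The second result is the valuable one: it exposes a gap between what we assumed the control does and what it actually does, before an attacker finds it.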

Crafting Security Chaos Experiments

SCE introduces observability through ...
