Chapter 6. Production Security in SCE

As we mentioned before, when we say “production security,” we’re referring to the security of software delivery beginning at the deployment phase and including the ongoing operation of software and services. You may also hear “production security” called “runtime security” or “operations security.” In this chapter, we’ll cover some of the core characteristics of infrastructure that create security by design, examples of failure in production systems, and how failure can inform both better infrastructure design and a strong detection and response program.

The deployment and operation of software in production is rapidly becoming essential to supporting revenue generation at organizations across a variety of industries. Failure can feel especially scary when it means an outage in a customer-facing service that directly and immediately translates into missed revenue. But, as always, failure still presents a valuable tool in our arsenal to ensure live, running systems can handle incidents gracefully and recover smoothly.

One of the only ways to proactively understand how your production systems respond to certain failure conditions is by conducting chaos engineering tests in production. Don’t fret: the authors are still anchored to reality! It’s very likely that your team or organization won’t feel comfortable starting its SCE program in production systems and will prefer to test development or staging environments first. However, ...
