16

Chaos Injector – Advanced Systems Stability

Nothing survives for long without evolving. Site reliability engineers (SREs) are in an endless cycle of learning and applying what they learn as they become proficient in making systems resilient to incidents, downtime, and outages. Once systems are stable enough, SREs understand how to increase their trustworthiness by injecting chaos into them and checking for weak links.

Two of the most distinctive and notorious practices in the site reliability engineering domain are the wheel-of-misfortune game and chaos engineering; both can raise a system’s reliability to a new level. This chapter starts by explaining these two techniques and how they work toward increasing a system’s availability, resiliency, and reliability. ...
