Chaos Engineering

by Casey Rosenthal, Lorin Hochstein, Aaron Blohowiak, Nora Jones, Ali Basiri

Released August 2017

Publisher(s): O'Reilly Media, Inc.

ISBN: 9781491953068

Book description

With so many interacting components, the number of things that can go wrong in a distributed system is enormous. You’ll never be able to prevent all possible failure modes, but you can identify many of the weaknesses in your system before they’re triggered by real-world events. This report introduces you to Chaos Engineering, a method of experimenting on infrastructure that lets you expose weaknesses before they become a real problem.

Members of the Netflix team that developed Chaos Engineering explain how to apply these principles to your own system. By introducing controlled experiments, you’ll learn how emergent behavior from component interactions can cause your system to drift into an unsafe, chaotic state.

Hypothesize about steady state by collecting data on the health of the system
Vary real-world events by turning off a server to simulate regional failures
Run your experiments as close to the production environment as possible
Ramp up your experiment by automating it to run continuously
Minimize the effects of your experiments to keep from blowing everything up
Learn the process for designing chaos engineering experiments
Use the Chaos Maturity Model to map the state of your chaos program, including realistic goals
