Chapter 7. LinkedIn Being Mindful of Members

Whenever you run a chaos experiment in production, you risk affecting the users of your product. Without loyal users there would be no systems to maintain, so they must come first when you plan an experiment. Some minor impact may be inevitable, but it is very important to minimize the blast radius of a chaos experiment and to have a simple recovery plan that returns everything to normal. In fact, minimizing the blast radius is one of the advanced principles of Chaos Engineering (see Chapter 3). In this chapter, you will learn best practices for adhering to this principle, along with a story of how it was implemented within the software industry.

To put this theme in context, let’s briefly shift gears to the automotive industry. All modern vehicles undergo rigorous crash testing by manufacturers, third parties, and governments to assess how well they protect passengers in an accident. To perform these tests, engineers use crash test dummies that simulate the human body and carry sensors that help determine how a crash would affect an actual human.

Automotive crash test dummies have evolved significantly over the past decades. In 2018, the NHTSA came out with Thor, which has been called the most lifelike crash test dummy ever developed. With around 140 data channels, Thor gives engineers rich data on how accidents would impact real humans, and dummies like it are able to give ...
