Chapter 2Case Study: The Exception That Grounded an Airline

Have you ever noticed that the incidents that blow up into the biggest issues start with something very small? A tiny programming error starts the snowball rolling downhill. As it gains momentum, the scale of the problem keeps getting bigger and bigger. A major airline experienced just such an incident. It eventually stranded thousands of passengers and cost the company hundreds of thousands of dollars. Here’s how it happened.

As always, all names, places, and dates have been changed to protect the confidentiality of the people and companies involved.

It started with a planned failover on the database cluster that served the core facilities (CF). The airline was moving toward a ...

Get Release It!, 2nd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.