Let It Crash

Sometimes the best thing you can do to create system-level stability is to abandon component-level stability. In the Erlang world, this is called the “let it crash” philosophy. We know from Chapter 2, Case Study: The Exception That Grounded an Airline, that there is no hope of preventing every possible error. Dimensions proliferate and the state space exponentiates. There’s just no way to test everything or predict all the ways a system can break. We must assume that errors will happen.
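To make the idea concrete, here is a minimal sketch of a supervisor that gives up on a failed component and starts a fresh one instead of trying to repair it. It is written in Python rather than Erlang, and the names `Worker`, `supervise`, and `max_restarts` are hypothetical, chosen only for illustration. It does not show how Erlang's supervisors are actually implemented.

```python
class Worker:
    """Hypothetical component whose in-memory state may become suspect after an error."""

    def __init__(self):
        self.processed = 0  # a fresh worker always starts in a known good state

    def handle(self, item):
        if item is None:  # stand-in for an error nobody predicted
            raise ValueError("unexpected input")
        self.processed += 1
        return item * 2


def supervise(items, max_restarts=3):
    """Feed items to a worker; on any crash, discard the worker and start a new one.

    Assumption for this sketch: the failing item is dropped, and the supervisor
    escalates by re-raising once max_restarts is exceeded.
    """
    worker = Worker()
    restarts = 0
    results = []
    for item in items:
        try:
            results.append(worker.handle(item))
        except Exception:
            restarts += 1
            if restarts > max_restarts:
                raise  # too many crashes: let the next level up decide
            worker = Worker()  # no repair attempt, just a clean replacement
    return results, restarts
```

Calling `supervise([1, None, 2])` loses the bad item and one worker instance, but the remaining items are still processed. The component was allowed to fail so that the system as a whole keeps running.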

The key question is, “What do we do with the error?” Most of the time, we try to recover from it. That means getting the system back into a known good state using things like exception handlers to fix the execution stack and try-finally blocks ...
