Chapter 7. Principled failure handling

You have seen that resilience requires distributing and compartmentalizing systems. Distribution is the only way to avoid being knocked out by a single failure, whether its cause is hardware, software, or human; compartmentalization isolates the distributed units from each other so that the failure of one does not spread to the others. The conclusion was that restoring proper function after a failure requires delegating the responsibility for reacting to that failure to a supervisor.
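As a rough illustration of this delegation, the following minimal sketch shows a supervisor that owns a worker and replaces it when it fails. The names `Worker`, `Supervisor`, and `submit` are hypothetical and chosen only for this example, and the sketch runs in a single process, so it shows the division of responsibility rather than real distribution.

```python
from typing import Callable, List


class Worker:
    """A compartment: holds its own state and may fail while processing."""

    def __init__(self) -> None:
        self.processed: List[int] = []

    def handle(self, item: int) -> None:
        # Hypothetical failure condition: negative items cannot be processed.
        if item < 0:
            raise ValueError(f"cannot process {item}")
        self.processed.append(item)


class Supervisor:
    """Owns a worker and decides how to react when it fails."""

    def __init__(self, factory: Callable[[], Worker]) -> None:
        self._factory = factory
        self.worker = factory()
        self.restarts = 0

    def submit(self, item: int) -> bool:
        """Pass an item to the worker; return False if the worker failed."""
        try:
            self.worker.handle(item)
            return True
        except Exception:
            # The worker does not repair itself. The supervisor discards it
            # and creates a fresh instance, so the failure stays contained.
            self.worker = self._factory()
            self.restarts += 1
            return False


supervisor = Supervisor(Worker)
results = [supervisor.submit(item) for item in (1, 2, -1, 3)]
```

In this sketch the worker only raises an error and the supervisor alone decides what happens next. Restarting discards the failed worker's state, which is one possible reaction among several; a real supervisor might instead retry, escalate, or stop the worker.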

The importance of ownership already appeared in the decomposition of a system according to divide et regna, expressed as the difference between a descendant module and a dependency. Descendants own a piece of ...
