Resilient
Failures are inevitable. Hardware crashes, software has defects, unexpected data arrives, or an unexpected and poorly tested execution path is taken. Any of these events, or a combination of them, can happen at any time. Resilience is the ability of the system to withstand such a situation and continue to deliver the expected results.
It can be achieved through redundancy of the deployable components and hardware, through isolation of parts of the system from each other (so that a domino effect becomes less probable), by designing the system so that a lost piece can be replaced automatically or an appropriate alarm is raised so that qualified personnel can intervene, and through other measures.
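As a rough illustration of how redundancy, isolation, and alarms can work together, here is a minimal Python sketch. It is not taken from the text: the names call_with_failover, AllReplicasFailed, broken_replica, and healthy_replica are hypothetical, and a "replica" is assumed to be any callable that serves a request. Each replica is called inside its own try block, so one failure does not spread to the others; the next replica takes over automatically, and an alarm is raised only when every replica has failed.

```python
from typing import Callable, List, Sequence


class AllReplicasFailed(Exception):
    """Raised when every redundant replica failed to serve the request."""


def call_with_failover(
    replicas: Sequence[Callable[[str], str]],
    request: str,
    alarm: Callable[[str], None],
) -> str:
    """Try each redundant replica in turn and return the first result.

    Hypothetical helper for illustration only.
    """
    errors: List[Exception] = []
    for replica in replicas:
        try:
            # Isolation: a failure here is contained and recorded,
            # and the loop moves on to the next replica.
            return replica(request)
        except Exception as exc:
            errors.append(exc)
    # Nothing could serve the request: notify the people who can intervene.
    alarm(f"all {len(errors)} replicas failed for request {request!r}")
    raise AllReplicasFailed(errors)


# Hypothetical stand-ins for two deployed instances of the same component.
def broken_replica(request: str) -> str:
    raise ConnectionError("node down")


def healthy_replica(request: str) -> str:
    return request.upper()


if __name__ == "__main__":
    alarms: List[str] = []
    # The first replica fails, the second one answers, no alarm is needed.
    print(call_with_failover([broken_replica, healthy_replica], "ping", alarms.append))
    print(alarms)
```

In a real system the alarm callback would more likely page an operator or publish a monitoring event than append to a list, and a supervisor might also restart the failed replica; both are outside the scope of this sketch.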
We have talked about distributed ...