Appendix A. Appendices

Patterns for Fault Tolerant Software Thumbnails

Table A.1. Patterns for fault tolerant software

Pattern

Pattern Intent

Acknowledgement (17)

Send a reply message to let a communicating party know that the sender is alive.

Checkpoint (37)

Save state periodically so that it does not need to be regenerated from the beginning of execution.

Checksum (25)

Add information to data or messages to verify that they are correct.

Complete Parameter Checking (14)

Check all the inputs and parameters rigorously to prevent bad results from causing errors during execution.

Concentrated Recovery (29)

The system should have as few distractions as possible during error recovery.

Correcting Audits

Design data to be checked and check data for errors. If errors are found, both correct the erroneous data and look for errors in related data.

Data Reset (41)

Restore some data to its initial (or a predetermined) value when it is found incorrect.

Deferrable Work (43)

A system that is performing well in an overload situation does not need to be fixed by Routine Maintenance (22).

Equitable Resource Allocation (45)

Divide the resources equitably among all the requestors.

Error Containment Barrier (13)

Isolate errors so that they do not spread.

Error Correcting Codes (57)

Add redundant information to data so that errors can be detected, as in Checksum (25), and also corrected automatically.

Error Handler (30)

Provide a controlled manner for handling errors.

Escalation (9)

When error processing steps are not producing ...
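As one illustration of the thumbnails above, the Checksum (25) intent of adding information to messages so that they can be verified might be sketched as follows. This is a minimal sketch, not the book's implementation: the function names `add_checksum` and `verify_checksum` are hypothetical, and the choice of CRC-32 with a 4-byte trailer is an assumption made for the example.

```python
import zlib

CHECKSUM_SIZE = 4  # assumption: a 4-byte CRC-32 trailer appended to each message


def add_checksum(payload: bytes) -> bytes:
    """Return the payload with its CRC-32 appended as a big-endian trailer."""
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    return payload + crc.to_bytes(CHECKSUM_SIZE, "big")


def verify_checksum(message: bytes) -> bytes:
    """Return the payload if the trailing CRC-32 matches; raise ValueError otherwise."""
    if len(message) < CHECKSUM_SIZE:
        raise ValueError("message too short to contain a checksum")
    payload, trailer = message[:-CHECKSUM_SIZE], message[-CHECKSUM_SIZE:]
    expected = int.from_bytes(trailer, "big")
    if zlib.crc32(payload) & 0xFFFFFFFF != expected:
        raise ValueError("checksum mismatch: data is corrupted")
    return payload


if __name__ == "__main__":
    message = add_checksum(b"state update")
    # An intact message verifies and yields the original payload.
    print(verify_checksum(message))
    # Flipping a single bit is detected, but not corrected.
    corrupted = bytes([message[0] ^ 0x01]) + message[1:]
    try:
        verify_checksum(corrupted)
    except ValueError as err:
        print("detected:", err)
```

A checksum of this kind only detects corruption. Repairing the data automatically is the job of Error Correcting Codes (57), which carry more redundant information.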
