Patterns for Fault Tolerant Software

by Robert S. Hanmer
December 2007
Intermediate to advanced
309 pages
English
Wiley

Chapter 5. Detection Patterns

The first phase of fault tolerance is detection. Faults and the errors that they cause must be detected before any recovery or mitigating actions can be taken to tolerate their presence in the system. Waiting and letting unknown latent faults activate and cause errors that result in failures is not fault tolerant.

The patterns in this chapter help detect the presence of errors or failures and the faults that caused them. They provide a number of mechanisms to monitor the system and to detect if it is behaving erroneously. Two pairs of concepts drive detection at execution time: errors versus failures, and a priori knowledge versus comparison of redundant elements (see Figure 27).

A priori detection uses constraints on the system that are known in advance to determine whether it has deviated from correct behavior. The range of results to be considered includes system states, results, and any side effects. If nothing is known about the range of results, this method will not work.
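As an illustration of a priori detection, the sketch below checks results against ranges that are known in advance. It is a minimal example, not a method taken from the book: the `RangeCheck` class, the `detect_errors` function, and the reading names and limits are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RangeCheck:
    """A constraint known in advance: a named result must lie in [low, high].

    Hypothetical helper for illustration only.
    """
    name: str
    low: float
    high: float

    def violated_by(self, value: float) -> bool:
        # NaN fails both comparisons, so it is reported as a violation too.
        return not (self.low <= value <= self.high)


def detect_errors(readings: dict[str, float], checks: list[RangeCheck]) -> list[str]:
    """Return the names of results that deviate from their known constraints.

    A result that is missing entirely is also treated as an error.
    """
    errors = []
    for check in checks:
        value = readings.get(check.name)
        if value is None or check.violated_by(value):
            errors.append(check.name)
    return errors


# Assumed limits, chosen only to make the example concrete.
checks = [
    RangeCheck("temperature_c", -40.0, 85.0),
    RangeCheck("voltage", 4.5, 5.5),
]

print(detect_errors({"temperature_c": 21.0, "voltage": 7.2}, checks))
```

The check can only be as good as the advance knowledge behind it: a result with no known range has no `RangeCheck`, so deviations in it go undetected.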

Much of the fault tolerant programming literature has focused on the second method, that of comparing redundant results. Redundancy (3) in Chapter 4 discussed introducing redundant elements into the system. Many purpose-built systems with custom hardware use redundant hardware to execute the same program and compare the results. The comparison might be done in real time by hardware matchers that look at internal, partial computation ...
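Comparison of redundant results can also be sketched in software. The example below is a stand-in for illustration, not the hardware matching described above: it runs the same computation on several redundant elements and uses a simple majority vote to flag the ones that disagree. The function name `compare_redundant` and the replica functions are hypothetical, and results are assumed to be hashable values that can be compared for exact equality.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence


def compare_redundant(
    replicas: Sequence[Callable[[int], Hashable]],
    argument: int,
) -> tuple[Hashable, list[int]]:
    """Run every redundant element and compare the results.

    Returns the majority result and the indexes of the replicas that
    disagreed with it. Raises RuntimeError when no strict majority exists,
    because the comparison then cannot say which result is correct.
    """
    results = [replica(argument) for replica in replicas]
    majority_value, votes = Counter(results).most_common(1)[0]
    if votes <= len(results) // 2:
        raise RuntimeError("no majority among redundant results")
    disagreeing = [i for i, r in enumerate(results) if r != majority_value]
    return majority_value, disagreeing


# Three hypothetical replicas of the same computation; the third is faulty.
replicas = [
    lambda x: x * x,
    lambda x: x * x,
    lambda x: x * x + 1,
]

value, suspects = compare_redundant(replicas, 4)
print(value, suspects)
```

Unlike the a priori approach, this needs no advance knowledge of what a correct result looks like, only agreement among the redundant elements. The sketch does not handle a replica that raises an exception or never returns; a fuller version would need to treat those cases as disagreements as well.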

ISBN: 9780470319796