Chapter 18

Self-Stabilization

18.1 Introduction

In this chapter we discuss a class of algorithms, called self-stabilizing algorithms, that can tolerate many kinds of “data” faults. A “datal” fault corresponds to change in value of one or more variable of a program because of some unforeseen error. For example, a spanning tree is usually implemented using parent variables. Every node in the computer network maintains the parent pointer that points to the parent in the spanning tree. What if one or more of the parent pointers get corrupted? Now, the parent pointers may not form a valid spanning tree. Such errors will be called “data” faults. We will assume that the code of the program does not get corrupted. Because the code of the program does not change with time, it can be periodically checked for correctness using a copy stored in the secondary storage.

We assume that the system states can be divided into legal and illegal states. The definition of the legal state is dependent on the application. Usually, system and algorithm designers are very careful about transitions from the legal states, but illegal states of the system are ignored. When a fault occurs, the system moves to an illegal state and if the system is not designed properly, it may continue to execute in illegal states. A system is called self-stabilizing if regardless of the initial state, the system is guaranteed to reach a legal state after a finite number of moves.

We will illustrate the concept of self-stabilizing ...

Get Concurrent and Distributed Computing in Java now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.