6

–––––––––––––––––––––––

Diagnosability of Multiprocessor Systems

Chia-Wei Lee and Sun-Yuan Hsieh

6.1 INTRODUCTION

The rapid advances in very-large-scale integration (VLSI) technology and wafer-scale integration (WSI) technology have made it possible to design and produce a multiprocessor system containing hundreds or even thousands of processors (nodes) on a single chip. As the number of nodes in a multiprocessor system increases, node fault identification in such systems becomes more crucial for reliable computing. The process of discriminating between faulty nodes and fault-free nodes in a system is called fault diagnosis. When a faulty node is identified, it is replaced by a fault-free node to maintain the system’s reliability. The diagnosability of a system is the maximum number of faulty nodes that the system can identify.

Determining the diagnosability of multiprocessor systems under various strategies and models has been the focus of a great deal of research in recent years (e.g., see References 1–25). Among the proposed models, two are well known and widely used: the PMC model (after Preparata, Metze, and Chien [19]) and the MM model (after Maeng and Malek [18]). In the PMC model, a node can test whether another node v is faulty if a communication link exists between them. The PMC model assumes that the tests of faulty nodes performed by fault-free ones always return one and that the tests performed by faulty nodes return ...
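The testing rule of the PMC model can be illustrated with a small simulation. The following is only a minimal sketch, not the chapter's own formulation: the function name pmc_syndrome, the edge-list representation, and the seeded random generator are hypothetical choices made for illustration. It also assumes the convention commonly used with the PMC model that a fault-free tester returns zero for a fault-free node and that the outcome of a test performed by a faulty tester is arbitrary; the excerpt above is cut off before it states these details.

```python
import random


def pmc_syndrome(edges, faulty, seed=0):
    """Simulate test outcomes under PMC-style assumptions.

    edges  : iterable of undirected links (u, v); linked nodes test each other.
    faulty : set of faulty nodes.
    Returns a dict mapping (tester, tested) to 0 or 1.
    """
    rng = random.Random(seed)  # hypothetical: seeded only for repeatability
    syndrome = {}
    for u, v in edges:
        for tester, tested in ((u, v), (v, u)):
            if tester in faulty:
                # Assumption: a faulty tester's outcome is unreliable,
                # modeled here as a random bit.
                syndrome[(tester, tested)] = rng.randint(0, 1)
            else:
                # A fault-free tester returns 1 for a faulty node
                # and (by assumption) 0 for a fault-free node.
                syndrome[(tester, tested)] = 1 if tested in faulty else 0
    return syndrome


# Usage: a ring of four nodes in which node 2 is faulty.
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
result = pmc_syndrome(ring, faulty={2})
for (tester, tested), outcome in sorted(result.items()):
    print(f"node {tester} tests node {tested}: {outcome}")
```

In this sketch, both fault-free neighbors of node 2 report it as faulty, while the outcomes reported by node 2 itself carry no information. That is why fault diagnosis must work from the whole collection of test outcomes rather than from any single test.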

Scalable Computing and Communications: Theory and Practice by Samee U. Khan, Albert Y. Zomaya, Lizhe Wang
