Chapter 3

Developing Reliable Software

Samuel Keene

Introduction and Background

In this chapter we discuss software reliability, how it can be measured, and how it can be improved. Software is a script written by a programmer dictating how a system will behave under foreseen operating conditions. It typically performs its mainline function flawlessly, because that part of the software has been tested extensively by the time it is shipped. Problems typically arise when the software must handle program exceptions or branching conditions that were not accounted for properly in the original design. The programmer is usually hurrying to implement the mainline function, which makes it difficult to consider and deal properly with operational contingencies. The reliability analyst can help expose these conditions to the developer and get them handled properly.

Software is the embodiment of a programmer's thought processes. It faithfully executes its script. This can cause problems when the program faces circumstances that were not considered properly by the author of the code. For example, a planetary probe satellite was programmed to switch over to its backup power system if it did not receive any communications from the ground for a 7-day period. It was assumed that such lack of communication would signal a disruption in the satellite communications power system. The built-in maintenance strategy was to switch over to the backup power supply. This switching algorithm ...
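The switchover rule described above can be sketched in a few lines. This is only an illustration of the rule as stated in the text (no ground contact for 7 days triggers a switch to backup power); the class name, method names, and the use of whole days as the time unit are assumptions made for the example, not details of the actual flight software.

```python
from dataclasses import dataclass

# From the text: switch over after a 7-day period without ground contact.
SILENCE_LIMIT_DAYS = 7


@dataclass
class PowerSwitchover:
    """Hypothetical model of the built-in maintenance rule."""

    last_contact_day: int = 0        # mission day of the last ground contact
    active_supply: str = "primary"   # "primary" or "backup"

    def record_contact(self, now_day: int) -> None:
        # Any communication from the ground resets the silence timer.
        self.last_contact_day = now_day

    def check(self, now_day: int) -> str:
        # The script assumes that prolonged silence means the primary
        # power system has failed, and it acts on that assumption alone.
        silent_for = now_day - self.last_contact_day
        if self.active_supply == "primary" and silent_for >= SILENCE_LIMIT_DAYS:
            self.active_supply = "backup"
        return self.active_supply


if __name__ == "__main__":
    probe = PowerSwitchover()
    probe.record_contact(6)
    print(probe.check(12))  # 6 days of silence: still on primary
    print(probe.check(13))  # 7 days of silence: switched to backup
```

The sketch executes its script faithfully: it cannot tell a failed power system apart from any other cause of silence, because the only condition it encodes is the one its author foresaw.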
