208 8.5 Categorize the problem
being input to a faulty compiler. This approach considers the frequency of
use and misuse of the elements of the structure. The following elements
should be selected from the original input:
• The elements that are least frequently used
• The elements that are most frequently misused
• The combinations and sequences that are least frequently used
• The combinations and sequences that are most frequently misused
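As a rough illustration of the first two selection criteria, the following Python sketch ranks the kinds of elements in an input by how often they occur and flags those on a list of misuse-prone constructs. The token names, the `MISUSE_PRONE` set, and the function `select_suspects` are all hypothetical, invented for this example; a real tool would take its elements from a parser.

```python
from collections import Counter

# Hypothetical set of constructs considered easy to misuse.
MISUSE_PRONE = {"goto", "union"}

def select_suspects(elements, misuse_prone, k=2):
    """Return (k least frequently used kinds, misuse-prone kinds present)."""
    counts = Counter(elements)
    # Rank by ascending frequency; break ties by name for a stable order.
    ranked = sorted(counts, key=lambda name: (counts[name], name))
    rare = ranked[:k]
    misused = sorted(name for name in counts if name in misuse_prone)
    return rare, misused

# Hypothetical element kinds found in a program that is input to a compiler.
tokens = ["assign", "if", "assign", "goto", "call",
          "assign", "if", "union", "call"]

rare, misused = select_suspects(tokens, MISUSE_PRONE)
print("least used:", rare)
print("misuse-prone:", misused)
```

The same counting could be applied to pairs or longer runs of adjacent elements to cover the combinations and sequences in the last two criteria.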
Opportunities for misusing input data elements arise out of complex
input structures. A program in a high-level programming language is an
input data set for a compiler. Examples of frequently misused input elements
are goto statements, data overlays such as the C/C++ union, and Fortran
Yet another approach to cutting down a test input data set is to try to
reduce more than one aspect of the input at a time, keeping the others
constant. The aspects of a test data set include size, sequence, and values.
The size aspect includes the number of items and, in the case of arrays, the
number of dimensions. The sequence aspect includes repeating patterns and
ascending or descending ordering. The values aspect includes magnitude and
the set of unique values represented.
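The three aspects can be sketched as three separate reduction steps. The Python example below is only an illustration: `still_fails` is a hypothetical stand-in for running the program and checking for the defect, and each step here changes a single aspect, the simplest case, so that any change in behavior can be attributed to that aspect.

```python
def still_fails(data):
    # Hypothetical stand-in for "run the program and check for the defect".
    # In this sketch the defect manifests whenever a negative value is present.
    return any(x < 0 for x in data)

def reduce_size(data):
    """Try to keep only the first or the second half of the items."""
    half = len(data) // 2
    for candidate in (data[:half], data[half:]):
        if candidate and still_fails(candidate):
            return candidate
    return data

def reduce_sequence(data):
    """Try replacing the original ordering with ascending order."""
    candidate = sorted(data)
    return candidate if still_fails(candidate) else data

def reduce_values(data):
    """Try shrinking every magnitude to at most 1 while keeping its sign."""
    candidate = [(x > 0) - (x < 0) for x in data]
    return candidate if still_fails(candidate) else data

data = [40, -7, 12, 3, 99, 5, 8, 21]
data = reduce_size(data)      # fewer items; ordering and values kept
data = reduce_sequence(data)  # simpler ordering; size and values kept
data = reduce_values(data)    # smaller magnitudes; size kept
print(data)
```

Each step keeps its candidate only if the defect still manifests, so the data set never gets simpler at the cost of losing the failure.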
How does reducing the required input suggest hypotheses? It focuses
your attention on just those parts of the input that are causing problems. If
a test data set doesn't include a particular feature, there is no point in
forming a hypothesis that the handling of that feature is the cause of the defect.
By the time you have cut down your test case input to the minimum size
required to manifest the defect, you will have eliminated a whole host of
potential hypotheses from further consideration.
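A minimal sketch of this elimination, assuming the input can be treated as a list of features and that a hypothetical predicate `manifests_defect` reports whether a given input still shows the problem. The greedy one-at-a-time removal used here is just one simple reduction strategy, chosen for brevity.

```python
def minimize(items, manifests_defect):
    """Greedily drop items one at a time while the defect still shows."""
    current = list(items)
    i = 0
    while i < len(current):
        candidate = current[:i] + current[i + 1:]
        if manifests_defect(candidate):
            current = candidate  # item i was not needed; retry this index
        else:
            i += 1               # item i is needed; keep it and move on
    return current

# Hypothetical defect: triggered only when both features appear together.
def manifests_defect(features):
    return "union" in features and "goto" in features

original = ["loop", "union", "call", "goto", "array", "recursion"]
reduced = minimize(original, manifests_defect)

# Features absent from the reduced input need no hypothesis of their own.
ruled_out = [f for f in original if f not in reduced]
print("still required:", reduced)
print("ruled out:", ruled_out)
```

Every feature that lands in `ruled_out` corresponds to a hypothesis that no longer needs to be considered.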
8.5 Categorize the problem
8.5.1 Correctness
Here is a list of questions useful in categorizing a problem with the
correctness of the output of a program:
• Is there any output at all?
• Is any output missing (deletion)?
• Is there extra output (insertion)?
• Are the individual output values correct (substitution)?
• Are the individual output values in the correct order (transposition)?
• Are the individual output values close to correct, but not acceptable?
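Several of these questions can be answered mechanically when an expected output is available. The sketch below is one possible approach, not a prescribed method: it uses Python's standard `difflib.SequenceMatcher` to label the differences between an expected and an actual sequence of output values. The function name `categorize` and the label strings are assumptions made for this example.

```python
from difflib import SequenceMatcher

def categorize(expected, actual):
    """Return the set of discrepancy categories between two output lists."""
    if not actual:
        return {"no output"}
    if expected == actual:
        return set()
    # Same values in a different order: treat as transposition.
    if sorted(expected) == sorted(actual):
        return {"transposition"}
    labels = {"delete": "deletion",
              "insert": "insertion",
              "replace": "substitution"}
    matcher = SequenceMatcher(None, expected, actual, autojunk=False)
    # Simplification: a "replace" span of unequal lengths is still
    # reported as a substitution only.
    return {labels[tag] for tag, *_ in matcher.get_opcodes() if tag != "equal"}

print(categorize([1, 2, 3, 4], [1, 2, 4]))
print(categorize([1, 2, 3], [1, 2, 2, 3]))
print(categorize([1, 2, 3], [1, 9, 3]))
print(categorize([1, 2, 3], [1, 3, 2]))
```

The sketch does not cover the last question, values that are close to correct; that requires a numeric comparison rather than a sequence comparison.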
What hypotheses do these questions suggest? Are complete elements
missing, or are individual elements (numbers, strings) only partially
displayed? Are numerical values missing because they have been incorrectly
aggregated with others? Is an entire sequence of values missing from either
the beginning or the end of the output? Are individual random items
missing, or are groups of items missing?
Are the extra values due to repetition of correct values? Is an entire
sequence of values inserted at either the beginning or the end of the output?
Are individual random items inserted, or are groups of adjacent items
inserted?
Is the same value incorrectly substituted in many places, or are there
different values substituted in each erroneous location? Are the values
substituted a progression of related values?
Are the transposed values adjacent to their correct position, or are they
some distance from where they belong? Is the distance from the correct
position a constant for all values, or is it random? Are individual random
items out of position, or are groups of adjacent items out of position, but in
position with respect to each other?
Is the difference between the actual and expected values in the least
significant digits? Are the actual values off from the expected values by a
random or a constant amount? Are the actual values off from the expected
values by a constant factor or a progression of factors?
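The questions about constant amounts and constant factors lend themselves to a quick numeric check. The following Python sketch is an assumption-laden illustration: the function name `error_pattern`, the tolerance, and the returned labels are all invented here, and a progression of factors is not detected.

```python
import math

def error_pattern(expected, actual, tol=1e-9):
    """Classify how actual values deviate from expected values."""
    diffs = [a - e for e, a in zip(expected, actual)]
    if all(math.isclose(d, 0.0, abs_tol=tol) for d in diffs):
        return "match"
    # Every value is off by the same amount.
    if all(math.isclose(d, diffs[0], rel_tol=tol, abs_tol=tol) for d in diffs):
        return "constant offset"
    # Every value is off by the same multiplier (needs nonzero expected values).
    if all(e != 0 for e in expected):
        ratios = [a / e for e, a in zip(expected, actual)]
        if all(math.isclose(r, ratios[0], rel_tol=tol) for r in ratios):
            return "constant factor"
    return "no simple pattern"

print(error_pattern([1.0, 2.0, 4.0], [1.5, 2.5, 4.5]))
print(error_pattern([2.0, 4.0, 8.0], [20.0, 40.0, 80.0]))
print(error_pattern([1.0, 2.0, 3.0], [1.1, 2.5, 2.0]))
```

A constant offset might point toward something like an index or baseline error, and a constant factor toward something like a unit conversion, though either result only suggests where to look first.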
Identifying one of these situations doesn't guarantee that it's the cause of
correctness problems. If you find one, however, it's a good candidate for an
Here is a list of questions useful in categorizing a problem with the
completion of a program:
I Chapter 8