Debugging by Thinking


How does creating a test case suggest hypotheses? It focuses your attention on just those parts of the program that are actually causing problems. If a test program doesn't include a particular feature of the application, there is no point in forming a hypothesis that the handling of that feature is the cause of the defect. By the time you have cut down a test program to the minimum size required to manifest the defect, you will have eliminated a whole host of potential hypotheses from further consideration.

Creating a standalone test case is one of the first things we do when diagnosing a bug. If done correctly, it can provide you with additional hypotheses and refine the hypotheses that came out of stabilization activities.

8.4 Reduce the required input

The best test case is one from which every input element that has no bearing on whether the undesired behavior occurs has been removed. If the defective program fails when processing a large input data set, it's very important to reduce that data set before attempting to diagnose the problem.

If the input is a homogeneous aggregate of values, such as a matrix of floating-point numbers, you can try a matrix that has just the first and last rows of the original, or just the first and last columns. If this doesn't cause the problem to manifest itself, keep adding rows (or columns) from the original input back in until it does. You can also start with the upper-left corner of the array and add one row and one column at a time.
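The row-based version of this reduction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `shrink_rows` and the predicate `fails` are hypothetical, and `fails` stands in for whatever you do to run the defective program on a candidate input and check whether the problem appears.

```python
def shrink_rows(matrix, fails):
    """Return a smaller list of rows for which fails(rows) is still True.

    `fails` is a hypothetical callback: it runs the program under test
    on the candidate rows and returns True if the defect shows up.
    """
    if len(matrix) <= 2:
        return matrix
    # Start with just the first and last rows of the original input.
    kept = [0, len(matrix) - 1]
    candidate = [matrix[i] for i in kept]
    if fails(candidate):
        return candidate
    # Add the interior rows back one at a time, keeping original order.
    for i in range(1, len(matrix) - 1):
        kept.append(i)
        candidate = [matrix[j] for j in sorted(kept)]
        if fails(candidate):
            return candidate
    # Nothing smaller reproduced the problem; fall back to the original.
    return matrix


# Stand-in defect for the example: the "program" fails on negative values.
def has_negative(rows):
    return any(value < 0 for row in rows for value in row)


data = [[1, 2], [-3, 4], [5, 6], [7, 8], [9, 10]]
print(shrink_rows(data, has_negative))  # [[1, 2], [-3, 4], [9, 10]]
```

The same function can be applied to columns by transposing the matrix first, for example with `[list(col) for col in zip(*matrix)]`.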

If the input is a collection of heterogeneous aggregates, the techniques required to cut down the input are a bit more complicated. If the input file is a collection of independent records, you can take a random selection of 10 percent of the records. If the problem still manifests itself, take a random selection of 10 percent of the records you kept, and repeat the process until the problem no longer manifests itself. As an alternative, you can try cutting the input set in half, and if the problem persists, continue cutting it in half until the problem no longer manifests itself. In either case, the last input set that still showed the problem is your reduced test input.
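Both reduction loops can be sketched as follows. The names `shrink_by_sampling`, `shrink_by_halving`, and `fails` are hypothetical, and `fails` is assumed to run the defective program on a candidate record list and report whether the problem appears. Two choices here go slightly beyond the text and are assumptions of this sketch: the sampling version keeps records in their original order, and the halving version tries the second half when the first half does not reproduce the problem.

```python
import random


def shrink_by_sampling(records, fails, fraction=0.10, seed=0):
    """Repeatedly keep a random fraction of the records while fails() holds.

    Assumes fails(records) is True for the full input. Returns the last
    subset that still reproduced the problem.
    """
    rng = random.Random(seed)  # fixed seed so a reduction run is repeatable
    current = list(records)
    while len(current) > 1:
        size = max(1, int(len(current) * fraction))
        # Sample positions, then sort them to preserve record order.
        picked = sorted(rng.sample(range(len(current)), size))
        sample = [current[i] for i in picked]
        if not fails(sample):
            break  # this sample lost the problem; keep the previous set
        current = sample
    return current


def shrink_by_halving(records, fails):
    """Repeatedly keep whichever half of the records still fails."""
    current = list(records)
    while len(current) > 1:
        mid = len(current) // 2
        first, second = current[:mid], current[mid:]
        if fails(first):
            current = first
        elif fails(second):
            current = second
        else:
            break  # neither half alone reproduces the problem
    return current


# Stand-in defect for the example: record 37 triggers the problem.
def triggers_defect(recs):
    return 37 in recs


print(shrink_by_halving(range(100), triggers_defect))  # [37]
```

When the failure depends on a single record, halving converges on it quickly. When it depends on an interaction between records that land in different halves, the loop stops early and the remaining set has to be reduced some other way.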

If the reported problem was related to a particular key or combination of key values in the records, try selecting those records that have the problematic key or keys. If that selection still manifests the problem, use the random 10 percent method or the binary selection method to cut down the input set until the problem no longer manifests itself.
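Selecting by key is a simple filter. The sketch below assumes, purely for illustration, that each record is a Python dictionary; the function name `select_by_key` and the field names in the example are hypothetical.

```python
def select_by_key(records, key, suspect_values):
    """Keep only the records whose `key` field has a suspect value."""
    suspects = set(suspect_values)
    return [rec for rec in records if rec.get(key) in suspects]


# Hypothetical input: the problem was reported for customer "C17".
orders = [
    {"customer": "C01", "amount": 10.0},
    {"customer": "C17", "amount": 0.0},
    {"customer": "C17", "amount": 25.5},
    {"customer": "C42", "amount": 7.25},
]
subset = select_by_key(orders, "customer", ["C17"])
print(len(subset))  # 2
```

If the filtered subset still reproduces the problem, it becomes the starting point for further random or binary reduction.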

Another approach to cutting down a test input data set is warranted if

the input has a complex structure, such as an application program that is


Debugging by Thinking by Robert Charles Metzger