Repositories such as PROMISE are necessary, but they are only part of the solution to finding motivating and convincing evidence. We described in the previous section how the gathering of evidence is always somewhat context-specific. Recently we have come to appreciate how much the interpretation of evidence is also context-specific, even audience-specific. To see why, we need to digress briefly into a little theory.
Many software engineering problems live in a space of solutions that twists and turns like a blanket hastily thrown onto the floor. Imagine an ant searching the hills and valleys of this blanket, looking for the lowest point where, say, the software development effort and the number of defects are at a minimum.
If the problem is complex enough (and software design and software process decisions can be very complex indeed), there is no single best way to find the best solution. Our ant might get stuck in the wrong valley, thinking it the lowest when in fact it is not (e.g., if some ridge obscures its view of a lower neighboring valley).
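The ant's predicament can be made concrete with a minimal sketch. Everything here is an assumption made for illustration: the one-dimensional `blanket` function is an invented landscape (not a model of any real project), and `descend` is a plain greedy local search standing in for the ant, which only ever looks one step to either side.

```python
def blanket(x):
    """Hypothetical 1-D landscape: a shallow valley near x = -1, a deeper one near x = 2."""
    return (x + 1) ** 2 * (x - 2) ** 2 - x


def descend(height, start, step=0.1, max_steps=10_000):
    """Greedy local search: keep stepping to the lowest neighbour until none is lower."""
    x = start
    for _ in range(max_steps):
        # The current position is listed first, so ties never move the ant.
        best = min((x, x - step, x + step), key=height)
        if best == x:  # no neighbour is lower, so the ant stops here
            break
        x = best
    return x


# Two ants released on opposite sides of the ridge between the valleys.
left = descend(blanket, start=-1.5)
right = descend(blanket, start=1.0)
print(f"start -1.5 -> x = {left:.1f}, height = {blanket(left):.2f}")
print(f"start  1.0 -> x = {right:.1f}, height = {blanket(right):.2f}")
```

The ant released at -1.5 settles in the shallow valley near x = -1 and never learns about the deeper valley near x = 2, because every path toward it first climbs the ridge in between. The ant released at 1.0 happens to start on the right side of that ridge and finds the lower point.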
Optimization and Artificial Intelligence (AI) algorithms use various heuristics to explore this space of options. One heuristic is to model the context of the problem, according to the goals of a particular audience, and then nudge the search in a particular direction. Imagine that the ant is on a leash and the leash is being gently pulled by the goal heuristic.
Now, here’s the kicker. These search heuristics, although useful, impose a bias on the search results. If you change the context and change the heuristic bias, these algorithms find different “best” solutions. For example, in experiments with AI searches over software process models, Green et al. used two different goal functions [Green et al. 2009]. One goal represented a safety-critical government project, where the aim was to reduce both the development effort and the number of defects in the delivered software. The other goal represented a more standard business situation, where the aim was to rush software to market while trying not to inject too many defects. That study examined four different projects using an AI optimizer, and the search for each project was repeated under each goal. A striking feature of the results was that the recommendations generated using one goal were usually rejected by the other. For example, an AI search using one goal recommended increasing the time to delivery, whereas the other recommended decreasing it.
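The effect of swapping goal functions can be seen even in a toy setting. The sketch below does not reproduce the process models, the optimizer, or the findings of [Green et al. 2009]. The three candidate plans, their numbers, and the weights inside both goal functions are invented purely for illustration, and an exhaustive `min` over three options stands in for the AI search.

```python
# Hypothetical candidate plans: (name, months to delivery, effort, defects).
# All values are made up for this example.
PLANS = [
    ("compress schedule", 8, 120, 40),
    ("keep schedule", 12, 100, 25),
    ("extend schedule", 16, 90, 12),
]


def safety_critical_goal(plan):
    """Assumed scoring for a safety-critical project: effort plus heavily weighted defects."""
    _, _, effort, defects = plan
    return effort + 5 * defects  # lower is better


def rush_to_market_goal(plan):
    """Assumed scoring for a business project: delivery time dominates, defects still count."""
    _, months, _, defects = plan
    return 10 * months + defects  # lower is better


def recommend(goal):
    """Return the name of the plan that scores lowest under the given goal."""
    return min(PLANS, key=goal)[0]


print("safety-critical goal recommends:", recommend(safety_critical_goal))
print("rush-to-market goal recommends: ", recommend(rush_to_market_goal))
```

With these invented numbers, the same three plans yield opposite advice: one goal picks the plan that extends the schedule and the other picks the plan that compresses it. Nothing about the plans changed between the two runs; only the bias of the scoring did.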
This result has major implications for any researcher trying to find evidence that convinces an audience to change the process or tools used to develop software. Rather than assume that all evidence will convince all audiences, we need to tune the evidence to the audience. It is not enough to stand on the stage and pull seemingly impressive evidence out of a hat. Even statistically strong, repeated evidence from elegant studies will not prompt any change if the evidence is about issues that are irrelevant to the audience. Put another way, we have to respect the business bias of the audience, who may be asking themselves, “But what’s it all good for?”