In the Beginning
Decades ago, when asked what we thought was “beautiful evidence,” we would have laid out some combination of the following traits:
- Elegance of studies
Many studies in software engineering include human factors, because the effectiveness of many software technologies depends heavily on the people who use them.[1] But dealing with human variability is very challenging. Studies whose sophisticated designs minimize the confounding effects of that variability can earn the admiration of other researchers. For example, a study by Basili and Selby [Basili and Selby 1987] used a fractional factorial design, in which every developer used every technique under examination, and every technique was used on every program snippet in the experiment.
- Statistical strength
As mathematical sophistication grows, so too does an emphasis on statistically significant results, so that researchers can be confident that their theory has some real-world effect that can be picked out from random background noise.
- Replicability of results
Results are far more convincing when they are found again and again in many different contexts, rather than in a single context or set of experimental conditions. In other sciences, replication builds confidence, and for this reason much effort has been expended to make software engineering experiments easy to rerun by other researchers in other contexts [Basili et al. 1999]. As an example of replicability, Turhan showed that software defect predictors learned ...