The predictive success of our standard negative binomial regression model, together with practitioners' interest in using the technology, led us to build an automated tool that lets users generate fault predictions for the files in a release that is about to enter the system testing phase.
A prototype of this tool is now operational, and requires minimal user expertise and input. The user identifies the particular release of the system under development for which predictions are wanted and the specific file types that should be included in the predictions. In addition, the user tells the tool how to identify entries in the MR database that represent past changes that were made to correct defects. Defects are typically identified either by the development stage at which the MR was written, such as system testing, or by the role of the person who initiated the MR, such as a system tester.
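To make the two ways of identifying defect MRs concrete, the following is a minimal sketch of such a user-supplied rule. It is an illustration under stated assumptions, not the tool's implementation: the `ModificationRequest` record, its field names, and `make_defect_rule` are hypothetical, and a real MR database will have its own schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModificationRequest:
    # Hypothetical MR record; field names are assumptions for illustration.
    mr_id: str
    stage: str           # development stage at which the MR was written
    initiator_role: str  # role of the person who initiated the MR


def make_defect_rule(defect_stages=frozenset(), defect_roles=frozenset()):
    """Build a predicate that flags an MR as a defect fix.

    An MR counts as a defect if its stage is in defect_stages OR its
    initiator's role is in defect_roles (case-insensitive comparison).
    """
    stages = {s.lower() for s in defect_stages}
    roles = {r.lower() for r in defect_roles}

    def is_defect(mr):
        return mr.stage.lower() in stages or mr.initiator_role.lower() in roles

    return is_defect


# Example: treat every MR written during system testing as a defect.
by_stage = make_defect_rule(defect_stages={"system testing"})
# Example: treat every MR opened by a system tester as a defect.
by_role = make_defect_rule(defect_roles={"system tester"})
```

Either rule can then be applied to each MR in the history to separate defect fixes from other changes.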
The tool returns its results as a list of files sorted in decreasing order of the predicted number of faults. Users can indicate the percentage of the files for which they want to see results, and can optionally restrict the output to certain file types. For example, a user might ask for the worst 10% of the Java files, or for the 20% of files written in C, C++, Java, or SQL that are predicted to be most faulty. The results are generally produced very quickly.
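The selection step described above can be sketched as follows. This is an illustration only: the function name `worst_files`, the use of file extensions to stand for file types, taking the percentage over the files that remain after filtering, and rounding the cutoff up are all assumptions, not details of the tool.

```python
import math
from pathlib import PurePath


def worst_files(predictions, percent, extensions=None):
    """Return the top `percent` of files by predicted fault count.

    predictions: dict mapping file path -> predicted number of faults.
    extensions:  optional set such as {".java", ".c"}; None keeps all files.
    The percentage is applied after filtering, and the cutoff is rounded up
    (both are assumptions made for this sketch).
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")

    items = list(predictions.items())
    if extensions is not None:
        wanted = {e.lower() for e in extensions}
        items = [(path, n) for path, n in items
                 if PurePath(path).suffix.lower() in wanted]

    # Decreasing predicted faults; ties broken by path for a stable order.
    items.sort(key=lambda pair: (-pair[1], pair[0]))
    keep = math.ceil(len(items) * percent / 100)
    return items[:keep]


# Hypothetical predictions, used only to show the call pattern.
example = {"a.java": 5.0, "b.java": 1.0, "c.c": 9.0, "d.java": 3.0, "e.sql": 2.0}
top_java = worst_files(example, 50, {".java"})
top_all = worst_files(example, 20)
```

Here `top_java` holds the worst half of the Java files and `top_all` the worst 20% of all files in the hypothetical input.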
The tool does not require the user to have any knowledge of data mining ...