
10.4.5 Discussion of Results
Although we used only a few words to build the models, the AUC is high in many cases (see
Tables 10.13 and 10.15). The AUC values of the models built on the PITS-A data set
are very high. The best results obtained for predicting faults at each severity level are
shown below:
• For severity=high, average AUC=0.824–0.943
• For severity=medium, average AUC=0.727–0.846
TABLE 10.16
Results for Top-50 Words Corresponding to Low and Very Low Severity Faults

        Low Severity Defects             Very Low Severity Defects
Runs    AUC     Sensitivity  Cutoff      AUC     Sensitivity  Cutoff
1       0.801   0.692        0.224       0.754   0.667