Automated Parameter Optimization of Classification Techniques for Defect Prediction Models

Authors: Chakkrit Tantithamthavorn, Shane McIntosh, Ahmed E. Hassan, Kenichi Matsumoto

Venue: ICSE (2016 IEEE/ACM 38th International Conference on Software Engineering), pp. 321-332, 2016

Year: 2016

Abstract: Defect prediction models are classifiers that are trained to identify defect-prone software modules. Such classifiers have configurable parameters that control their characteristics (e.g., the number of trees in a random forest classifier). Recent studies show that these classifiers may underperform due to the use of suboptimal default parameter settings. However, it is impractical to assess all of the possible settings in the parameter spaces. In this paper, we investigate the performance of defect prediction models where Caret - an automated parameter optimization technique - has been applied. Through a case study of 18 datasets from systems that span both proprietary and open source domains, we find that (1) Caret improves the AUC performance of defect prediction models by as much as 40 percentage points; (2) Caret-optimized classifiers are at least as stable as (with 35% of them being more stable than) classifiers that are trained using the default settings; and (3) Caret increases the likelihood of producing a top-performing classifier by as much as 83%. Hence, we conclude that parameter settings can indeed have a large impact on the performance of defect prediction models, suggesting that researchers should experiment with the parameters of the classification techniques. Since automated parameter optimization techniques like Caret yield substantial benefits in terms of performance improvement and stability, while incurring a manageable additional computational cost, they should be included in future defect prediction studies.
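
Illustrative sketch: the general idea behind automated parameter optimization, as described in the abstract, is to generate a small set of candidate settings, estimate out-of-sample AUC for each one by resampling, and keep the best. The Python below is a minimal sketch under stated assumptions, not Caret itself (Caret is an R package) and not the authors' experimental setup. The synthetic data, the k-nearest-neighbours stand-in classifier, the use of bootstrap resampling, and the helper names `make_data`, `knn_score`, `bootstrap_auc`, and `optimize` are all hypothetical and chosen only for illustration.

```python
import random


def auc(scores, labels):
    """AUC as the chance that a defective module outscores a clean one (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def make_data(rng, n=120):
    """Hypothetical synthetic modules: two metrics each, label 1 = defective."""
    data = []
    for _ in range(n):
        y = rng.randrange(2)
        centre = 1.0 if y else 0.0
        data.append(((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)), y))
    return data


def knn_score(train, x, k):
    """Stand-in classifier: fraction of defective modules among the k nearest rows."""
    def dist(row):
        return sum((a - b) ** 2 for a, b in zip(row[0], x))
    nearest = sorted(train, key=dist)[:k]
    return sum(y for _, y in nearest) / k


def bootstrap_auc(data, k, rng, repeats=10):
    """Mean AUC on out-of-bag rows over several bootstrap resamples (an assumption)."""
    results = []
    for _ in range(repeats):
        idx = [rng.randrange(len(data)) for _ in data]
        chosen = set(idx)
        train = [data[i] for i in idx]
        test = [row for i, row in enumerate(data) if i not in chosen]
        labels = [y for _, y in test]
        if len(set(labels)) < 2:
            continue  # AUC is undefined unless both classes are present
        scores = [knn_score(train, x, k) for x, _ in test]
        results.append(auc(scores, labels))
    return sum(results) / len(results) if results else 0.0


def optimize(data, candidates, seed=0):
    """Evaluate every candidate setting and return the best one with all estimates."""
    performance = {}
    for k in candidates:
        rng = random.Random(seed)  # identical resamples for every candidate
        performance[k] = bootstrap_auc(data, k, rng)
    best = max(performance, key=performance.get)
    return best, performance


data = make_data(random.Random(42))
best_k, performance = optimize(data, candidates=[1, 5, 9, 15])
for k, value in performance.items():
    print(f"k={k:2d}  mean AUC={value:.3f}")
print("selected setting: k =", best_k)
```

Reusing the same resamples for every candidate keeps the comparison between settings paired, so differences in the estimates reflect the setting rather than the luck of the draw. The cost grows with the number of candidates times the number of resamples, which is the additional computational cost the abstract refers to.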

BibTeX:

@inproceedings{chakkrittantithamthavorn2016apooctfdpm,
    author = "Chakkrit Tantithamthavorn and Shane McIntosh and Ahmed E. Hassan and Kenichi Matsumoto",
    title = "Automated Parameter Optimization of Classification Techniques for Defect Prediction Models",
    year = "2016",
    pages = "321--332",
    booktitle = "Proceedings of the 2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE)"
}

Plain Text:

Chakkrit Tantithamthavorn, Shane McIntosh, Ahmed E. Hassan, and Kenichi Matsumoto, "Automated Parameter Optimization of Classification Techniques for Defect Prediction Models," 2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE), pp. 321-332, 2016.