Predicting Software Defects Using Bayesian Approaches

Authors

Keywords:

defect anticipation in software, Bayesian Networks, categorization, machine learning

Abstract

In software engineering, predicting software defects is important because it lets developers find and fix problems before they turn into expensive, hard-to-resolve bugs. Early identification of defects saves time and resources across the software development lifecycle and helps assure the quality of the final product. This study evaluates three algorithms for constructing Bayesian Networks that classify projects as defect-prone. While Naive Bayes is the most widely used method in the literature, this research considers K2, Hill Climbing, and TAN (Tree-Augmented Naive Bayes) as alternatives for building the networks. Three publicly available PROMISE datasets, described by McCabe and Halstead complexity metrics, are used. The results are benchmarked against two widely used approaches, Decision Tree and Random Forest. Performance metrics computed under cross-validation show that the classification results are on par with those of Decision Tree and Random Forest. Notably, the Bayesian algorithms exhibit lower variability, which makes their predictions more robust: their results stay consistent across different selections of training and test data, whereas the Decision Tree and Random Forest results vary more.
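The evaluation idea in the abstract (train a Bayesian classifier, score it with cross-validation, and look at how much the score varies between folds) can be illustrated with a minimal sketch. It is not the study's implementation: it uses plain Naive Bayes, the baseline the abstract mentions, rather than K2, Hill Climbing, or TAN, and it runs on synthetic data rather than the PROMISE datasets. The two discretized features, the 30% defect rate, and all function names are assumptions made for illustration.

```python
import math
import random
import statistics
from collections import Counter, defaultdict


def make_synthetic_modules(n=200, seed=42):
    """Generate hypothetical modules with two discretized complexity metrics."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        defective = rng.random() < 0.3  # assumed defect rate, illustration only
        # Assumption: defective modules tend to have high metric values.
        p_high = 0.8 if defective else 0.2
        complexity = "high" if rng.random() < p_high else "low"
        volume = "high" if rng.random() < p_high else "low"
        rows.append((complexity, volume))
        labels.append(defective)
    return rows, labels


def train_naive_bayes(rows, labels):
    """Count class and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    values_seen = defaultdict(set)         # feature index -> distinct values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    return class_counts, feature_counts, values_seen


def predict(model, row):
    """Return the class with the highest log-posterior (add-one smoothing)."""
    class_counts, feature_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in class_counts.items():
        score = math.log(count / total)
        for i, value in enumerate(row):
            numerator = feature_counts[(i, label)][value] + 1
            denominator = count + len(values_seen[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def k_fold_accuracies(rows, labels, k=5, seed=0):
    """Shuffle once, split into k folds, and return the accuracy on each fold."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    accuracies = []
    for fold in folds:
        held_out = set(fold)
        train_idx = [i for i in indices if i not in held_out]
        model = train_naive_bayes([rows[i] for i in train_idx],
                                  [labels[i] for i in train_idx])
        correct = sum(predict(model, rows[i]) == labels[i] for i in fold)
        accuracies.append(correct / len(fold))
    return accuracies


rows, labels = make_synthetic_modules()
accs = k_fold_accuracies(rows, labels, k=5)
mean_acc = statistics.mean(accs)
std_acc = statistics.stdev(accs)  # fold-to-fold spread, a simple variability measure
print("fold accuracies:", [round(a, 3) for a in accs])
print(f"mean={mean_acc:.3f} stdev={std_acc:.3f}")
```

Running the same loop with a second classifier and comparing the two standard deviations would give a rough analogue of the variability comparison the abstract reports. The numbers from this synthetic setup say nothing about the study's actual results.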

Published

2021-07-25

How to Cite

King, S. (2021). Predicting Software Defects Using Bayesian Approaches. Infotech Journal Scientific and Academic, 2(1), 53–82. Retrieved from https://infotechjournal.org/index.php/infotech/article/view/9

Section

Articles