Boosting Software Fault Prediction: Addressing Class Imbalance With Enhanced Ensemble Learning

Bibliographic Details
Main Authors: Hanan Sharif Alsorory, Mohammad Alshraideh
Format: Article
Language: English
Published: Wiley 2024-01-01
Series: Applied Computational Intelligence and Soft Computing
Online Access: http://dx.doi.org/10.1155/2024/2959582
Description
Summary: Software fault prediction (SFP) is a crucial aspect of software engineering, aiding in the early identification of potential defects. This proactive approach significantly contributes to enhancing software quality and reliability. However, a common challenge in SFP is class imbalance (CI). Ensemble learning (EL) is a powerful strategy for refining SFP models in object-oriented systems with imbalanced data and improving sensitivity to minority classes. This study aimed to improve the effectiveness of ensemble classifiers in SFP within object-oriented systems, tackling the challenges associated with imbalanced data. It focuses on enhancing the performance of three ensemble classifiers, BalancedBagging, RUSBoost, and EasyEnsemble, explicitly designed for imbalanced datasets. In Enhanced_BalancedBagging (E_BB) and ROSBoost, random undersampling (RUS) is substituted with random oversampling (ROS). Meanwhile, Enhanced_EasyEnsemble (E_EE) replaces RUS with ROS and AdaBoost with XGBoost. The experimental results demonstrate the superior performance of E_BB, ROSBoost, and E_EE over their base models, achieving the highest F-measure, balanced accuracy, and AUC. Statistical tests, such as the Wilcoxon signed-rank test, provide robust support for the enhanced models, highlighting their practical significance through substantial improvements in F-measure and AUC, as indicated by low negative rank sums and large effect sizes.
ISSN:1687-9732
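
The summary's central change is swapping random undersampling (RUS) for random oversampling (ROS) before the ensemble is trained. The sketch below illustrates only that resampling step on a toy dataset, using the Python standard library. It is not the authors' implementation: the function names `random_undersample` and `random_oversample`, the toy data, and the choice to balance every class to the smallest (RUS) or largest (ROS) class size are assumptions made for illustration, and no ensemble classifier is trained here.

```python
import random
from collections import Counter


def _group_by_class(X, y):
    """Group feature rows by their class label (hypothetical helper)."""
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(features)
    return by_class


def random_undersample(X, y, seed=0):
    """RUS: drop rows from larger classes until all match the smallest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(X, y)
    target = min(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        X_out.extend(rng.sample(rows, target))  # sample without replacement
        y_out.extend([label] * target)
    return X_out, y_out


def random_oversample(X, y, seed=0):
    """ROS: keep every row and duplicate minority rows up to the largest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(X, y)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        # Draw the missing rows with replacement from the same class.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        X_out.extend(rows + extra)
        y_out.extend([label] * target)
    return X_out, y_out


# Toy data (assumed): 8 non-faulty modules (label 0), 2 faulty modules (label 1).
X = [[i, i * 2] for i in range(10)]
y = [0] * 8 + [1] * 2

X_rus, y_rus = random_undersample(X, y)
X_ros, y_ros = random_oversample(X, y)
print("RUS class counts:", Counter(y_rus))
print("ROS class counts:", Counter(y_ros))
```

In this sketch RUS discards majority-class rows, while ROS retains all original rows and repeats minority-class rows, so the resampled set is larger but loses no information. That difference is what the substitution of ROS for RUS changes in the training data each ensemble member sees.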