Boosting Software Fault Prediction: Addressing Class Imbalance With Enhanced Ensemble Learning

Bibliographic Details
Main Authors: Hanan Sharif Alsorory, Mohammad Alshraideh
Format: Article
Language: English
Published: Wiley 2024-01-01
Series: Applied Computational Intelligence and Soft Computing
Online Access: http://dx.doi.org/10.1155/2024/2959582
collection DOAJ
description Software fault prediction (SFP) is a crucial aspect of software engineering, aiding in the early identification of potential defects. This proactive approach contributes significantly to software quality and reliability. However, a common challenge in SFP is class imbalance (CI). Ensemble learning (EL) is a powerful strategy for refining SFP models in object-oriented systems with imbalanced data and for improving sensitivity to minority classes. This study aims to improve the effectiveness of ensemble classifiers in SFP within object-oriented systems, tackling the challenges associated with imbalanced data. It focuses on enhancing the performance of three ensemble classifiers explicitly designed for imbalanced datasets: BalancedBagging, RUSBoost, and EasyEnsemble. In Enhanced_BalancedBagging (E_BB) and ROSBoost, random undersampling (RUS) is replaced with random oversampling (ROS). Enhanced_EasyEnsemble (E_EE) replaces RUS with ROS and AdaBoost with XGBoost. The experimental results show that E_BB, ROSBoost, and E_EE outperform their base models, achieving the highest F-measure, balanced accuracy, and AUC. Statistical tests, such as the Wilcoxon signed-rank test, support the enhanced models and highlight their practical significance through substantial improvements in F-measure and AUC, as indicated by low negative rank sums and large effect sizes.
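The substitution described in the abstract (ROS in place of RUS) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `random_undersample` and `random_oversample`, the toy data, and the 90/10 class split are assumptions made only for illustration. The sketch shows how each strategy balances the classes before a base learner in an ensemble would be trained.

```python
import random
from collections import Counter


def _group_by_class(X, y):
    """Group feature rows by their class label (hypothetical helper)."""
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(features)
    return by_class


def random_undersample(X, y, rng):
    """RUS: sample each class down to the size of the smallest class."""
    by_class = _group_by_class(X, y)
    target = min(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        # Sampling without replacement discards majority-class rows.
        for features in rng.sample(rows, target):
            X_out.append(features)
            y_out.append(label)
    return X_out, y_out


def random_oversample(X, y, rng):
    """ROS: keep every row and duplicate rows up to the largest class size."""
    by_class = _group_by_class(X, y)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        # Sampling with replacement adds copies of minority-class rows.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for features in rows + extra:
            X_out.append(features)
            y_out.append(label)
    return X_out, y_out


# Toy, made-up data: 90 non-faulty modules (0) and 10 faulty modules (1).
X = [[i] for i in range(100)]
y = [0] * 90 + [1] * 10

rng = random.Random(0)
X_rus, y_rus = random_undersample(X, y, rng)
X_ros, y_ros = random_oversample(X, y, rng)

rus_counts = Counter(y_rus)
ros_counts = Counter(y_ros)
print(dict(rus_counts))  # {0: 10, 1: 10}
print(dict(ros_counts))  # {0: 90, 1: 90}
```

In this sketch RUS trains on 20 rows and discards 80 majority-class rows, while ROS trains on 180 rows and keeps all original data at the cost of duplicated minority rows. That trade-off is the one the enhanced models exchange; whether it helps on a given dataset is an empirical question of the kind the study evaluates.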
id doaj-art-ac9d00f04bb446659cf71925720cee80
institution OA Journals
issn 1687-9732
affiliations Hanan Sharif Alsorory (Computer Science Department); Mohammad Alshraideh (Artificial Intelligence Department)