Evaluating the Predictive Power of Software Metrics for Fault Localization

Bibliographic Details
Main Authors:	Issar Arab, Kenneth Magel, Mohammed Akour
Format:	Article
Language:	English
Published:	MDPI AG 2025-06-01
Series:	Computers
Subjects:	fault localization; software quality assurance; machine learning; software metrics; test coverage; automated debugging
Online Access:	https://www.mdpi.com/2073-431X/14/6/222

Description:	Fault localization remains a critical challenge in software engineering, directly impacting debugging efficiency and software quality. This study investigates the predictive power of various software metrics for fault localization by framing the task as a multi-class classification problem and evaluating it using the Defects4J dataset. We fitted thousands of models and benchmarked different algorithms (including deep learning, Random Forest, XGBoost, and LightGBM) to choose the best-performing model. To enhance model transparency, we applied explainable AI techniques to analyze feature importance. The results revealed that test suite metrics consistently outperform static and dynamic metrics, making them the most effective predictors for identifying faulty classes. These findings underscore the critical role of test quality and coverage in automated fault localization. By combining machine learning with transparent feature analysis, this work delivers practical insights to support more efficient debugging workflows. It lays the groundwork for an iterative process that integrates metric-based predictive models with large language models (LLMs), enabling future systems to automatically generate targeted test cases for the most fault-prone components, which further enhances the automation and precision of software testing.
ISSN:	2073-431X
Volume:	14, Issue 6, Article 222
DOI:	10.3390/computers14060222
Affiliation (Issar Arab):	Adrem Data Laboratory, Department of Computer Science, University of Antwerp, 2020 Antwerpen, Belgium
Affiliation (Kenneth Magel):	Faculty of Computer Science, North Dakota State University, Fargo, ND 58108, USA
Affiliation (Mohammed Akour):	College of Computer and Information Sciences, Prince Sultan University, Riyadh 12435, Saudi Arabia
