Metric-based defect prediction from class diagram

A software defect refers to a fault, failure, or error in software. With the rapid development and increasing reliance on software products, it is essential to identify these defects as early and easily as possible, given the efforts and budget invested in their creation and maintenance. In the lite...

Bibliographic Details
Main Authors: Batnyam Battulga, Lkhamrolom Tsoodol, Enkhzol Dovdon, Naranchimeg Bold, Oyun-Erdene Namsrai
Format: Article
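Language: English
Published: Elsevier 2025-09-01
Series: Array
Subjects: Software defect prediction; UML Class diagram; Machine learning; Deep learning; Object oriented design metrics; Reverse-engineering
Online Access: http://www.sciencedirect.com/science/article/pii/S2590005625000657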
author Batnyam Battulga
Lkhamrolom Tsoodol
Enkhzol Dovdon
Naranchimeg Bold
Oyun-Erdene Namsrai
collection DOAJ
description A software defect is a fault, failure, or error in software. With the rapid development of and increasing reliance on software products, it is essential to identify defects as early and as easily as possible, given the effort and budget invested in creating and maintaining software. In the literature, various approaches, such as machine learning (ML) and deep learning (DL), have been proposed and proven effective in detecting defects in source code during the implementation or testing phases of the software development life cycle (SDLC). An approach that predicts defects at earlier stages of the SDLC, particularly during the design phase, is needed to enhance software quality while reducing time, effort, and costs. Meanwhile, software metrics provide a quantifiable way to analyze software, making it easier to identify defects. Many researchers have leveraged these metrics to predict defects using ML and DL methods, achieving state-of-the-art performance. The objective of this paper is to present a novel approach for predicting defects in class diagrams (i.e., at the design stage) using ML and DL with software metrics. Because defect datasets extracted from class diagrams are lacking, we first created a model-based metric dataset by reverse engineering a code-based dataset. We then applied various ML and DL techniques to the newly created dataset to predict defects in classes by classifying each class as either defective or clean. The study uses a large dataset called the Unified Bug Dataset, which comprises five publicly available sub-datasets. We compare ML and DL models in terms of accuracy, precision, recall, F-measure, and AUC, and provide a performance comparison against code-based methods. Finally, we conducted a cross-dataset experiment to evaluate the generalizability of our approach.
format Article
id doaj-art-420ffc97fa914c909f824e05a3422d4e
institution DOAJ
issn 2590-0056
language English
publishDate 2025-09-01
publisher Elsevier
record_format Article
series Array
spelling Record ID: doaj-art-420ffc97fa914c909f824e05a3422d4e
Timestamp: 2025-08-20T02:44:28Z
Language: eng
Publisher: Elsevier
Journal: Array
ISSN: 2590-0056
Published: 2025-09-01
Volume: 27
Article number: 100438
DOI: 10.1016/j.array.2025.100438
Title: Metric-based defect prediction from class diagram
Authors: Batnyam Battulga, Lkhamrolom Tsoodol, Enkhzol Dovdon, Naranchimeg Bold (corresponding author), Oyun-Erdene Namsrai
Affiliation (all authors): Department of Information and Computer Sciences, School of Information Technology and Electronics, National University of Mongolia, Mongolia
URL: http://www.sciencedirect.com/science/article/pii/S2590005625000657
Keywords: Software defect prediction; UML Class diagram; Machine learning; Deep learning; Object oriented design metrics; Reverse-engineering
title Metric-based defect prediction from class diagram
topic Software defect prediction
UML Class diagram
Machine learning
Deep learning
Object oriented design metrics
Reverse-engineering
url http://www.sciencedirect.com/science/article/pii/S2590005625000657
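The abstract lists accuracy, precision, recall, F-measure, and AUC as the criteria for comparing models that label each class as defective or clean. The sketch below shows how these measures are commonly computed for a binary classifier. It is an illustration only, not the authors' code: the function names, the 0/1 label encoding, and the toy data are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class BinaryScores:
    accuracy: float
    precision: float
    recall: float
    f_measure: float


def score_binary(y_true, y_pred):
    """Score 0/1 predictions; 1 = defective, 0 = clean (assumed encoding)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # defective, flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # clean, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # defective, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # clean, passed
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return BinaryScores(accuracy, precision, recall, f_measure)


def auc(y_true, scores):
    """AUC as the fraction of (defective, clean) pairs ranked correctly.

    Ties between a defective and a clean score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): eight classes, four of them truly defective.
toy_true = [1, 0, 1, 1, 0, 0, 1, 0]
toy_pred = [1, 0, 0, 1, 0, 1, 1, 0]
toy_prob = [0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3]

toy_scores = score_binary(toy_true, toy_pred)
toy_auc = auc(toy_true, toy_prob)
print(toy_scores, toy_auc)
```

On this toy data, one defective class is missed and one clean class is wrongly flagged, so accuracy, precision, recall, and F-measure all come out at 0.75, while the AUC is 15/16 = 0.9375. AUC is computed from the ranking of predicted scores rather than from the thresholded labels, which is why it can differ from the other four measures.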