Comparative analysis of impact of classification algorithms on security and performance bug reports
Identification and classification of bugs, e.g., security and performance are a preemptive and fundamental practice which contributes to the development of secure and efficient software. Software Quality Assurance (SQA) needs to classify bugs into relevant categories, e.g., security and performance...
| Main Authors: | Said Maryyam, Bin Faiz Rizwan, Aljaidi Mohammad, Alshammari Muteb |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | De Gruyter, 2024-12-01 |
| Series: | Journal of Intelligent Systems |
| Subjects: | bug classification, security bug, performance bug, text mining, bug prediction |
| Online Access: | https://doi.org/10.1515/jisys-2024-0045 |
| _version_ | 1850163856270163968 |
|---|---|
| author | Said Maryyam; Bin Faiz Rizwan; Aljaidi Mohammad; Alshammari Muteb |
| author_facet | Said Maryyam; Bin Faiz Rizwan; Aljaidi Mohammad; Alshammari Muteb |
| author_sort | Said Maryyam |
| collection | DOAJ |
| description | Identification and classification of bugs, e.g., security and performance are a preemptive and fundamental practice which contributes to the development of secure and efficient software. Software Quality Assurance (SQA) needs to classify bugs into relevant categories, e.g., security and performance bugs since one type of bug may have a higher preference over another, thus facilitating software evolution and maintenance. In addition to classification, it would be ideal for the SQA manager to prioritize security and performance bugs based on the level of perseverance, severity, or impact to assign relevant developers whose expertise is aligned with the identification of such bugs, thus facilitating triaging. The aim of this research is to compare and analyze the prediction accuracy of machine learning algorithms, i.e., Artificial neural network (ANN), Support vector machine (SVM), Naïve Bayes (NB), Decision tree (DT), Logistic regression (LR), and K-nearest neighbor (KNN) to identify security and performance bugs from the bug repository. We first label the existing dataset from the Bugzilla repository with the help of a software security expert to train the algorithms. Our research type is explanatory, and our research method is controlled experimentation, in which the independent variable is prediction accuracy and the dependent variables are ANN, SVM, NB, DT, LR, and KNN. First, we applied preprocessing, Term Frequency-Inverse Document Frequency feature extraction methods, and then applied classification algorithms. The results were measured through accuracy, precision, recall, and F-measure and then the results were compared and validated through the ten-fold cross-validation technique. Comparative analysis reveals that two algorithms (SVM and LR) perform better in terms of precision (0.99) for performance bugs and three algorithms (SVM, ANN, and LR) perform better in terms of F1 score for security bugs as compared to other classification algorithms which are essentially due to the linear dataset and extensive number of features in the dataset. |
| format | Article |
| id | doaj-art-a1c0b6041b9e475baf722b24b1622073 |
| institution | OA Journals |
| issn | 2191-026X |
| language | English |
| publishDate | 2024-12-01 |
| publisher | De Gruyter |
| record_format | Article |
| series | Journal of Intelligent Systems |
| spelling | doaj-art-a1c0b6041b9e475baf722b24b1622073; 2025-08-20T02:22:06Z; eng; De Gruyter; Journal of Intelligent Systems; 2191-026X; 2024-12-01; 33113159; 10.1515/jisys-2024-0045; Comparative analysis of impact of classification algorithms on security and performance bug reports; Said Maryyam [0]; Bin Faiz Rizwan [1]; Aljaidi Mohammad [2]; Alshammari Muteb [3]; [0] Faculty of Computing Riphah International University, Islamabad, 46000, Pakistan; [1] Faculty of Computing Riphah International University, Islamabad, 46000, Pakistan; [2] Department of Computer Science, Faculty of Information Technology, Zarqa University, Zarqa, 13116, Jordan; [3] Department of Information Technology, Faculty of Computing and Information Technology Northern Border University, Rafha, 91431, Saudi Arabia; Identification and classification of bugs, e.g., security and performance are a preemptive and fundamental practice which contributes to the development of secure and efficient software. Software Quality Assurance (SQA) needs to classify bugs into relevant categories, e.g., security and performance bugs since one type of bug may have a higher preference over another, thus facilitating software evolution and maintenance. In addition to classification, it would be ideal for the SQA manager to prioritize security and performance bugs based on the level of perseverance, severity, or impact to assign relevant developers whose expertise is aligned with the identification of such bugs, thus facilitating triaging. The aim of this research is to compare and analyze the prediction accuracy of machine learning algorithms, i.e., Artificial neural network (ANN), Support vector machine (SVM), Naïve Bayes (NB), Decision tree (DT), Logistic regression (LR), and K-nearest neighbor (KNN) to identify security and performance bugs from the bug repository. We first label the existing dataset from the Bugzilla repository with the help of a software security expert to train the algorithms. Our research type is explanatory, and our research method is controlled experimentation, in which the independent variable is prediction accuracy and the dependent variables are ANN, SVM, NB, DT, LR, and KNN. First, we applied preprocessing, Term Frequency-Inverse Document Frequency feature extraction methods, and then applied classification algorithms. The results were measured through accuracy, precision, recall, and F-measure and then the results were compared and validated through the ten-fold cross-validation technique. Comparative analysis reveals that two algorithms (SVM and LR) perform better in terms of precision (0.99) for performance bugs and three algorithms (SVM, ANN, and LR) perform better in terms of F1 score for security bugs as compared to other classification algorithms which are essentially due to the linear dataset and extensive number of features in the dataset.; https://doi.org/10.1515/jisys-2024-0045; bug classification; security bug; performance bug; text mining; bug prediction |
| spellingShingle | Said Maryyam; Bin Faiz Rizwan; Aljaidi Mohammad; Alshammari Muteb; Comparative analysis of impact of classification algorithms on security and performance bug reports; Journal of Intelligent Systems; bug classification; security bug; performance bug; text mining; bug prediction |
| title | Comparative analysis of impact of classification algorithms on security and performance bug reports |
| title_full | Comparative analysis of impact of classification algorithms on security and performance bug reports |
| title_fullStr | Comparative analysis of impact of classification algorithms on security and performance bug reports |
| title_full_unstemmed | Comparative analysis of impact of classification algorithms on security and performance bug reports |
| title_short | Comparative analysis of impact of classification algorithms on security and performance bug reports |
| title_sort | comparative analysis of impact of classification algorithms on security and performance bug reports |
| topic | bug classification; security bug; performance bug; text mining; bug prediction |
| url | https://doi.org/10.1515/jisys-2024-0045 |
| work_keys_str_mv | AT saidmaryyam comparativeanalysisofimpactofclassificationalgorithmsonsecurityandperformancebugreports AT binfaizrizwan comparativeanalysisofimpactofclassificationalgorithmsonsecurityandperformancebugreports AT aljaidimohammad comparativeanalysisofimpactofclassificationalgorithmsonsecurityandperformancebugreports AT alshammarimuteb comparativeanalysisofimpactofclassificationalgorithmsonsecurityandperformancebugreports |
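
The description above outlines a pipeline of preprocessing, TF-IDF feature extraction, classification, and ten-fold cross-validation scored by precision, recall, and F-measure. The record does not include the authors' code, so what follows is only a minimal sketch of that kind of pipeline under stated assumptions: whitespace tokenization stands in for the unspecified preprocessing, a common smoothed IDF variant is used, a cosine-similarity KNN stands in for the six classifiers compared in the article, and the ten bug reports are hypothetical text invented for illustration (not the Bugzilla data). All function and variable names are illustrative.

```python
import math
from collections import Counter

# Hypothetical toy reports, invented for illustration (not the paper's Bugzilla data).
TEXTS = [
    "buffer overflow exploit in parser",
    "xss exploit via crafted url",
    "privilege escalation exploit in installer",
    "buffer overflow via crafted certificate",
    "xss injection in crafted form",
    "slow startup and high memory usage",
    "page rendering slow with high cpu",
    "memory leak causes slow scrolling",
    "high latency when loading large tables",
    "startup latency and cpu spike",
]
LABELS = ["security"] * 5 + ["performance"] * 5


def tokenize(text):
    """Assumed preprocessing: lowercase and split on whitespace."""
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency learned from tokenized training docs."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}


def tfidf(doc, idf):
    """L2-normalised sparse TF-IDF vector; tokens unseen in training are dropped."""
    tf = Counter(tok for tok in doc if tok in idf)
    vec = {tok: cnt * idf[tok] for tok, cnt in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {tok: v / norm for tok, v in vec.items()}


def cosine(a, b):
    """Dot product of two L2-normalised sparse vectors (their cosine similarity)."""
    return sum(v * b.get(tok, 0.0) for tok, v in a.items())


def knn_predict(train_vecs, train_labels, vec, k=3):
    """Majority vote among the k most similar training vectors."""
    ranked = sorted(range(len(train_vecs)),
                    key=lambda i: cosine(train_vecs[i], vec), reverse=True)
    votes = Counter(train_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


def cross_validate(texts, labels, folds=10, k=3):
    """Return one out-of-fold prediction per report; IDF is refit on each training split."""
    docs = [tokenize(t) for t in texts]
    preds = [None] * len(docs)
    for f in range(folds):
        train_idx = [i for i in range(len(docs)) if i % folds != f]
        test_idx = [i for i in range(len(docs)) if i % folds == f]
        idf = fit_idf([docs[i] for i in train_idx])
        train_vecs = [tfidf(docs[i], idf) for i in train_idx]
        train_labels = [labels[i] for i in train_idx]
        for i in test_idx:
            preds[i] = knn_predict(train_vecs, train_labels, tfidf(docs[i], idf), k)
    return preds


def precision_recall_f1(y_true, y_pred, positive):
    """Per-class precision, recall, and F1, treating `positive` as the target class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


preds = cross_validate(TEXTS, LABELS, folds=10, k=3)
for cls in ("security", "performance"):
    p, r, f1 = precision_recall_f1(LABELS, preds, cls)
    print(f"{cls}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

With only ten toy reports, ten folds amount to leave-one-out, and the two toy classes share no vocabulary, so the printed scores say nothing about the results reported in the article. Refitting the IDF weights inside each fold keeps the held-out reports from influencing the features used to classify them.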