Leveraging Machine Learning for Enhanced Bug Triaging in Open-Source Software Projects
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | |
| Online Access: | https://ieeexplore.ieee.org/document/11106424/ |
| Summary: | Bug triaging, the process of classifying and assigning software issues to appropriate developers, is a critical yet challenging task in large-scale software development. Manual triaging is time-consuming, inconsistent, and prone to human bias, which often delays issue resolution and misallocates developer resources. This study explores the application of machine learning to automate and improve bug triaging efficiency and accuracy. Using a dataset of over 122,000 issues from the microsoft/vscode GitHub repository, we evaluate several machine learning models including Bidirectional LSTM, CNN-LSTM, Random Forest, and Multinomial Naive Bayes. Our primary contribution is the development of an Augmented Bidirectional LSTM model that integrates enriched textual features and contextual metadata. This model, optimized using Optuna, outperforms traditional baselines, achieving a Micro F1-score of 0.6469 and Hamming Loss of 0.0133 for label prediction, and a Micro F1-score of 0.5974 with Hamming Loss of 0.0062 for assignee recommendation. In addition to demonstrating strong predictive performance, we present a robust end-to-end pipeline for data preprocessing, augmentation, model training, and evaluation using multi-label classification techniques. The study highlights how deep learning architectures, in combination with feature engineering and hyperparameter tuning, can provide scalable and generalizable components to support the automation of bug triaging. These findings contribute to the growing field of intelligent software maintenance by offering data-driven approaches that can support developer workflows and improve issue management efficiency in open-source environments. |
| ISSN: | 2169-3536 |
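
The summary reports results as Micro F1-score and Hamming Loss, the two standard metrics for multi-label classification. As a reading aid, the sketch below shows how these metrics are commonly computed from per-issue label sets. It is not the authors' evaluation code: the label names, the three example issues, and the helper names (`binarize`, `micro_f1`, `hamming_loss`) are all hypothetical and chosen only for illustration.

```python
from typing import Iterable, List, Sequence, Set


def binarize(label_sets: Iterable[Set[str]], classes: Sequence[str]) -> List[List[int]]:
    """Turn each set of labels into a 0/1 indicator row over `classes`."""
    return [[1 if c in labels else 0 for c in classes] for labels in label_sets]


def micro_f1(y_true: List[List[int]], y_pred: List[List[int]]) -> float:
    """Micro-averaged F1: pool TP/FP/FN over every (issue, label) cell."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def hamming_loss(y_true: List[List[int]], y_pred: List[List[int]]) -> float:
    """Fraction of (issue, label) cells where prediction and truth disagree."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total if total else 0.0


# Hypothetical label vocabulary and three made-up issues (not from the dataset).
CLASSES = ["bug", "feature-request", "editor", "terminal"]
TRUE_LABELS = [{"bug", "editor"}, {"feature-request"}, {"bug", "terminal"}]
PRED_LABELS = [{"bug"}, {"feature-request", "editor"}, {"bug", "terminal"}]

Y_TRUE = binarize(TRUE_LABELS, CLASSES)
Y_PRED = binarize(PRED_LABELS, CLASSES)

if __name__ == "__main__":
    print(f"Micro F1:     {micro_f1(Y_TRUE, Y_PRED):.4f}")
    print(f"Hamming Loss: {hamming_loss(Y_TRUE, Y_PRED):.4f}")
```

In this toy case one label is missed and one is spuriously predicted across 12 label cells, giving a Micro F1 of 0.8 and a Hamming Loss of about 0.167. Hamming Loss counts the many correctly absent labels as agreement, so with a large label or assignee vocabulary it can be very small even when Micro F1 is moderate, which is consistent with the pattern of the figures quoted in the summary.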