A Multidocument Summarization Technique for Informative Bug Summaries
To help developers grasp bug information, bug summaries should contain bug descriptions and information on the reproduction steps, environment, and solutions to be informative for developers. However, previously established bug report summarization techniques generate bug summaries mainly by identif...
| Main Authors: | Samal Mukhtar, Seonah Lee, Jueun Heo |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2024-01-01 |
| Series: | IEEE Access |
| Subjects: | Bug report summarization; classification; combination; bug summaries |
| Online Access: | https://ieeexplore.ieee.org/document/10737053/ |
| _version_ | 1850150842749943808 |
|---|---|
| author | Samal Mukhtar Seonah Lee Jueun Heo |
| author_facet | Samal Mukhtar Seonah Lee Jueun Heo |
| author_sort | Samal Mukhtar |
| collection | DOAJ |
| description | To help developers grasp bug information, bug summaries should contain bug descriptions and information on the reproduction steps, environment, and solutions to be informative for developers. However, previously established bug report summarization techniques generate bug summaries mainly by identifying significant sentences, which often miss those bug report attributes. In this paper, we aim to generate informative summaries that cover these specific bug report attributes in a structured form. There are two challenges. First, the relevant information is sometimes scattered over multiple sources. Second, information on the reproduction steps and environment is often filtered out by previous techniques, which identify significant sentences on the basis of their relationships. Therefore, we propose a bug summarization technique that collects information from multiple sources, including duplicates and pull requests, and a classification technique for identifying sentences that provide relevant information on the reproduction steps and environment. Our proposed technique, ClaSum, consists of four steps: preprocessing, classification, sentence ranking, and summarization. We adopted RoBERTa for our classification step, Opinion and Topic association scores for the sentence ranking step, and BART for the summarization step. Our comparative experiments show that our technique outperforms the state-of-the-art technique BugSum in terms of the F1 score by 14%, 8%, and 35% on the SDS, ADS, and DDS datasets, respectively. Additionally, our qualitative investigation shows that our technique generates a more structural summary than two well-known LLMs, Gemini and Claude. |
| format | Article |
| id | doaj-art-61b2feed34fc4d499f6154e20dd2d40e |
| institution | OA Journals |
| issn | 2169-3536 |
| language | English |
| publishDate | 2024-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-61b2feed34fc4d499f6154e20dd2d40e; 2025-08-20T02:26:27Z; eng; IEEE; IEEE Access; 2169-3536; 2024-01-01; 12; 158908; 158926; 10.1109/ACCESS.2024.3487443; 10737053; A Multidocument Summarization Technique for Informative Bug Summaries; Samal Mukhtar (0); Seonah Lee (1), https://orcid.org/0000-0002-2004-2924; Jueun Heo (2); Department of AI Convergence Engineering, Gyeongsang National University, Jinju-si, Republic of Korea; Department of AI Convergence Engineering, Gyeongsang National University, Jinju-si, Republic of Korea; Department of AI Convergence Engineering, Gyeongsang National University, Jinju-si, Republic of Korea; To help developers grasp bug information, bug summaries should contain bug descriptions and information on the reproduction steps, environment, and solutions to be informative for developers. However, previously established bug report summarization techniques generate bug summaries mainly by identifying significant sentences, which often miss those bug report attributes. In this paper, we aim to generate informative summaries that cover these specific bug report attributes in a structured form. There are two challenges. First, the relevant information is sometimes scattered over multiple sources. Second, information on the reproduction steps and environment is often filtered out by previous techniques, which identify significant sentences on the basis of their relationships. Therefore, we propose a bug summarization technique that collects information from multiple sources, including duplicates and pull requests, and a classification technique for identifying sentences that provide relevant information on the reproduction steps and environment. Our proposed technique, ClaSum, consists of four steps: preprocessing, classification, sentence ranking, and summarization. We adopted RoBERTa for our classification step, Opinion and Topic association scores for the sentence ranking step, and BART for the summarization step. Our comparative experiments show that our technique outperforms the state-of-the-art technique BugSum in terms of the F1 score by 14%, 8%, and 35% on the SDS, ADS, and DDS datasets, respectively. Additionally, our qualitative investigation shows that our technique generates a more structural summary than two well-known LLMs, Gemini and Claude.; https://ieeexplore.ieee.org/document/10737053/; Bug report summarization; classification; combination; bug summaries |
| spellingShingle | Samal Mukhtar Seonah Lee Jueun Heo A Multidocument Summarization Technique for Informative Bug Summaries IEEE Access Bug report summarization classification combination bug summaries |
| title | A Multidocument Summarization Technique for Informative Bug Summaries |
| title_full | A Multidocument Summarization Technique for Informative Bug Summaries |
| title_fullStr | A Multidocument Summarization Technique for Informative Bug Summaries |
| title_full_unstemmed | A Multidocument Summarization Technique for Informative Bug Summaries |
| title_short | A Multidocument Summarization Technique for Informative Bug Summaries |
| title_sort | multidocument summarization technique for informative bug summaries |
| topic | Bug report summarization classification combination bug summaries |
| url | https://ieeexplore.ieee.org/document/10737053/ |