A Multidocument Summarization Technique for Informative Bug Summaries

To help developers grasp bug information, bug summaries should contain bug descriptions and information on the reproduction steps, environment, and solutions to be informative for developers. However, previously established bug report summarization techniques generate bug summaries mainly by identif...

Bibliographic Details
Main Authors: Samal Mukhtar, Seonah Lee, Jueun Heo
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Access
Subjects: Bug report summarization; classification; combination; bug summaries
Online Access: https://ieeexplore.ieee.org/document/10737053/
collection DOAJ
description To help developers grasp bug information, bug summaries should contain bug descriptions and information on the reproduction steps, environment, and solutions to be informative for developers. However, previously established bug report summarization techniques generate bug summaries mainly by identifying significant sentences, which often miss those bug report attributes. In this paper, we aim to generate informative summaries that cover these specific bug report attributes in a structured form. There are two challenges. First, the relevant information is sometimes scattered over multiple sources. Second, information on the reproduction steps and environment is often filtered out by previous techniques, which identify significant sentences on the basis of their relationships. Therefore, we propose a bug summarization technique that collects information from multiple sources, including duplicates and pull requests, and a classification technique for identifying sentences that provide relevant information on the reproduction steps and environment. Our proposed technique, ClaSum, consists of four steps: preprocessing, classification, sentence ranking, and summarization. We adopted RoBERTa for our classification step, Opinion and Topic association scores for the sentence ranking step, and BART for the summarization step. Our comparative experiments show that our technique outperforms the state-of-the-art technique BugSum in terms of the F1 score by 14%, 8%, and 35% on the SDS, ADS, and DDS datasets, respectively. Additionally, our qualitative investigation shows that our technique generates a more structural summary than two well-known LLMs, Gemini and Claude.
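The four steps named in the description (preprocessing, classification, sentence ranking, summarization) can be illustrated with a minimal Python sketch. This is not the authors' implementation: keyword cues stand in for the RoBERTa classifier, an average word-frequency score stands in for the Opinion and Topic association scores, and a structured selection of top-ranked sentences stands in for BART. The cue lists, function names, and sample texts are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical cue words; the paper uses a RoBERTa classifier for this step.
CUES = {
    "steps": ("step", "reproduce", "click", "open", "run"),
    "environment": ("version", "windows", "linux", "macos", "browser"),
    "solution": ("fix", "patch", "workaround", "pull request"),
}


def preprocess(documents):
    """Step 1: split every source document into trimmed, non-empty sentences."""
    sentences = []
    for doc in documents:
        for sentence in re.split(r"(?<=[.!?])\s+", doc):
            sentence = sentence.strip()
            if sentence:
                sentences.append(sentence)
    return sentences


def classify(sentence):
    """Step 2: assign a bug report attribute (keyword stand-in for RoBERTa)."""
    lowered = sentence.lower()
    for label, cues in CUES.items():
        if any(cue in lowered for cue in cues):
            return label
    return "description"


def rank(sentences):
    """Step 3: order sentences by average word frequency across all sources
    (a stand-in for the Opinion and Topic association scores)."""
    counts = Counter(
        word for s in sentences for word in re.findall(r"[a-z0-9]+", s.lower())
    )

    def score(sentence):
        words = re.findall(r"[a-z0-9]+", sentence.lower())
        return sum(counts[w] for w in words) / max(len(words), 1)

    return sorted(sentences, key=score, reverse=True)


def summarize(documents, per_section=1):
    """Step 4: keep the top-ranked sentences per attribute in a structured
    form (an extractive stand-in for the BART summarizer)."""
    summary = {"description": [], "steps": [], "environment": [], "solution": []}
    for sentence in rank(preprocess(documents)):
        section = summary[classify(sentence)]
        if len(section) < per_section:
            section.append(sentence)
    return summary


if __name__ == "__main__":
    # Hypothetical inputs: a bug report, a duplicate report, and a pull request.
    sources = [
        "The app crashes when saving a file. Steps to reproduce: open a file "
        "and click Save. Tested on Windows 11 with version 2.3.",
        "Saving a file crashes the app for me too.",
        "This patch adds a null check before saving.",
    ]
    for attribute, picked in summarize(sources).items():
        print(f"{attribute}: {' '.join(picked)}")
```

Sentences are pooled from all sources before ranking, so a reproduction step that appears only in a duplicate report or a solution that appears only in a pull request can still reach its section of the summary.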
format Article
id doaj-art-61b2feed34fc4d499f6154e20dd2d40e
institution OA Journals
issn 2169-3536
language English
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-61b2feed34fc4d499f6154e20dd2d40e; 2025-08-20T02:26:27Z; eng; IEEE; IEEE Access; 2169-3536; 2024-01-01; 12; 158908; 158926; 10.1109/ACCESS.2024.3487443; 10737053
A Multidocument Summarization Technique for Informative Bug Summaries
Samal Mukhtar (0); Seonah Lee (1, https://orcid.org/0000-0002-2004-2924); Jueun Heo (2)
Department of AI Convergence Engineering, Gyeongsang National University, Jinju-si, Republic of Korea (listed once for each of the three authors)
https://ieeexplore.ieee.org/document/10737053/
Bug report summarization; classification; combination; bug summaries
title A Multidocument Summarization Technique for Informative Bug Summaries
topic Bug report summarization
classification
combination
bug summaries
url https://ieeexplore.ieee.org/document/10737053/