Current Trends and Advances in Extractive Text Summarization: A Comprehensive Review

Given the rapid increase of textual data in various fields, text summarization has become essential for efficient information handling. Over recent decades, numerous methods have been proposed to enhance summarization processes, and various review papers and books have been published to encapsulate...

Bibliographic Details
Main Authors: Maryam Azam, Shah Khalid, Sulaiman Almutairi, Hasan Ali Khattak, Abdallah Namoun, Amjad Ali, Hafiz Syed Muhammad Bilal
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/10872906/
collection DOAJ
description Given the rapid increase of textual data in various fields, text summarization has become essential for efficient information handling. Over recent decades, numerous methods have been proposed to enhance summarization processes, and various review papers and books have been published to encapsulate these methodologies and discuss their implications. However, existing reviews often fail to provide a comprehensive retrospective of recent advancements, particularly concerning detailed architectural frameworks, the field’s current state, evaluation methodologies, and unresolved challenges. This paper addresses this gap by presenting a detailed analysis of the extractive approaches, encompassing their inherent strengths, limitations, and underlying mechanisms. We present a detailed, multi-layered architectural framework designed to advance and develop summarization models, thereby supporting researchers in their endeavors. The text summarization framework consists mainly of text preprocessing, feature extraction, sentence scoring, use of a base model, sentence selection and output summary, and post-processing. Furthermore, this review of 145 research articles categorizes domain-specific summarization techniques, focusing on unique challenges and tailored strategies for news, scientific articles, and social media. These techniques include statistical, fuzzy logic, rule, optimization, graph, clustering-based, machine learning, and deep learning. We emphasize the impact of evaluation metrics and benchmark datasets in performance assessment, providing a detailed analysis of the commonly utilized datasets and metrics (mainly ROUGE-1, ROUGE-2, ROUGE-L, and ROUGE-S) in the current literature. This review article is a valuable resource for advancing text summarization techniques in natural language processing and machine learning by identifying future research directions and open challenges. Notable challenges include expanding summarization for complex tasks, multiple documents, multimodal user input, multi-format and multilingual data, refining the stopping criteria, and improving the evaluation metrics.
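The pipeline stages named in the description (preprocessing, feature extraction, sentence scoring, sentence selection, post-processing) and the ROUGE-N metric can be illustrated with a minimal sketch. This is not the architecture from the article: the "base model" stage is replaced here by a plain word-frequency scorer, and every name in the block (`STOPWORDS`, `split_sentences`, `tokenize`, `summarize`, `rouge_n_recall`) is a hypothetical helper chosen for illustration. Only the recall variant of ROUGE-N is shown.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stopword list, not a standard one.
STOPWORDS = {"a", "an", "the", "of", "and", "to", "in", "is", "are",
             "for", "on", "it", "this", "that", "with"}


def split_sentences(text):
    """Preprocessing: naive sentence split on ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def tokenize(sentence):
    """Preprocessing: lowercase, keep alphanumeric tokens, drop stopwords."""
    words = re.findall(r"[a-z0-9]+", sentence.lower())
    return [w for w in words if w not in STOPWORDS]


def summarize(text, k=2):
    """Select the k highest-scoring sentences and return them in source order."""
    sentences = split_sentences(text)
    tokens = [tokenize(s) for s in sentences]
    # Feature extraction: document-level term frequencies.
    freq = Counter(w for sent in tokens for w in sent)
    # Sentence scoring: mean term frequency of the sentence's tokens.
    scores = [sum(freq[w] for w in sent) / len(sent) if sent else 0.0
              for sent in tokens]
    # Sentence selection: top-k by score, ties broken by position.
    top = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))[:k]
    # Post-processing: restore the original sentence order.
    return " ".join(sentences[i] for i in sorted(top))


def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall: clipped n-gram overlap divided by reference n-gram count."""
    def ngrams(text):
        words = re.findall(r"[a-z0-9]+", text.lower())
        return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total
```

For example, `summarize(document, k=3)` would return the three highest-scoring sentences of `document`, and `rouge_n_recall(summary, reference, n=2)` would score that summary against a human-written reference with bigram recall.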
id doaj-art-364b2ac9964a486a855017936e3ebf9e
institution DOAJ
issn 2169-3536
spelling doaj-art-364b2ac9964a486a855017936e3ebf9e
2025-08-20T03:01:11Z
IEEE Access, vol. 13, pp. 28150-28166, 2025-01-01
DOI: 10.1109/ACCESS.2025.3538886
IEEE Xplore document: 10872906
Maryam Azam (https://orcid.org/0009-0002-5265-2695), School of Electrical Engineering and Computer Science, National University of Sciences and Technology (NUST), Islamabad, Pakistan
Shah Khalid (https://orcid.org/0000-0001-5735-5863), School of Electrical Engineering and Computer Science, National University of Sciences and Technology (NUST), Islamabad, Pakistan
Sulaiman Almutairi (https://orcid.org/0000-0003-4810-6018), Department of Health Informatics, College of Public Health and Health Informatics, Qassim University, Buraydah, Qassim, Saudi Arabia
Hasan Ali Khattak (https://orcid.org/0000-0002-8198-9265), School of Electrical Engineering and Computer Science, National University of Sciences and Technology (NUST), Islamabad, Pakistan
Abdallah Namoun (https://orcid.org/0000-0002-7050-0532), AI Center, Faculty of Computer and Information Systems, Islamic University of Madinah, Madinah, Saudi Arabia
Amjad Ali (https://orcid.org/0000-0001-9117-3692), Department of Computer and Software Technologies, University of Swat, Swat, Khyber Pakhtunkhwa, Pakistan
Hafiz Syed Muhammad Bilal, School of Electrical Engineering and Computer Science, National University of Sciences and Technology (NUST), Islamabad, Pakistan
topic Survey
text summarization
transformer-based models
domain-specific summarization
generic architecture
datasets and evaluation measures