Deep Learning-Based Feature Extraction Technique for Single Document Summarization Using Hybrid Optimization Technique
Presently, the exponential growth of unstructured data on the web and social networks has made it increasingly challenging for individuals to retrieve relevant information efficiently. Over the years, various text summarization techniques have been developed to address this issue. However, tradition...
Main Authors: | Jyotirmayee Rautaray, Sangram Panigrahi, Ajit Kumar Nayak, Premananda Sahu, Kaushik Mishra |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2025-01-01 |
Series: | IEEE Access |
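Subjects: | Text summarization; single document summarization; pre-processing; feature extraction; vectorization; hybrid CSO-HHO algorithm |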
Online Access: | https://ieeexplore.ieee.org/document/10870163/ |
_version_ | 1823859641686163456 |
---|---|
author | Jyotirmayee Rautaray; Sangram Panigrahi; Ajit Kumar Nayak; Premananda Sahu; Kaushik Mishra |
author_facet | Jyotirmayee Rautaray; Sangram Panigrahi; Ajit Kumar Nayak; Premananda Sahu; Kaushik Mishra |
author_sort | Jyotirmayee Rautaray |
collection | DOAJ |
description | Presently, the exponential growth of unstructured data on the web and social networks has made it increasingly challenging for individuals to retrieve relevant information efficiently. Over the years, various text summarization techniques have been developed to address this issue. However, traditional approaches that rely on directly extracting words often lead to redundancies and fail to establish a strong connection between the summary and the original document. This paper presents a novel Deep Learning (DL)-based text summarization approach incorporating the following phases: pre-processing, feature extraction, vectorization, and summarization using a hybrid Cat Swarm Optimization (CSO) and Harris Hawk Optimization (HHO) algorithm. Initially, input documents undergo pre-processing steps, including sentence segmentation, word tokenization, stop word removal, and lemmatization, to enhance text quality. Features are then extracted using a Restricted Boltzmann Machine (RBM) to obtain nine key attributes. Vectorization is performed using Term Frequency-Inverse Document Frequency (TF-IDF) to represent sentences in vector form. The hybrid CSO-HHO algorithm is subsequently applied to generate summaries. The proposed method’s efficiency was evaluated using datasets from the Document Understanding Conference (DUC), specifically DUC-2002, DUC-2003, and DUC-2005. Metrics such as sensitivity, readability, coherence, precision, BLEU score, ROUGE score, and F-score were analyzed to assess performance. The proposed approach’s results were compared with existing methods, including CSO, QABC, PSO, GJO, FF, and machine learning techniques like SVM and RF. The hybrid CSO-HHO algorithm achieved an accuracy of 99.56%, demonstrating its superiority in text summarization tasks. |
format | Article |
id | doaj-art-c86c49aae2e140569b8cad60f471685e |
institution | Kabale University |
issn | 2169-3536 |
language | English |
publishDate | 2025-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj-art-c86c49aae2e140569b8cad60f471685e; 2025-02-11T00:01:19Z; eng; IEEE; IEEE Access; 2169-3536; 2025-01-01; 13; 24515; 24529; 10.1109/ACCESS.2025.3538169; 10870163; Deep Learning-Based Feature Extraction Technique for Single Document Summarization Using Hybrid Optimization Technique; Jyotirmayee Rautaray (0, https://orcid.org/0000-0003-2747-3919); Sangram Panigrahi (1); Ajit Kumar Nayak (2); Premananda Sahu (3, https://orcid.org/0000-0002-9360-8423); Kaushik Mishra (4, https://orcid.org/0000-0001-9499-0727); Department of Computer Science and Engineering, Institute of Technical Education and Research, Siksha “O” Anusandhan University, Bhubaneswar, Odisha, India; Department of Computer Science and Information Technology, Institute of Technical Education and Research, Siksha “O” Anusandhan University, Bhubaneswar, Odisha, India; Department of Computer Science and Information Technology, Institute of Technical Education and Research, Siksha “O” Anusandhan University, Bhubaneswar, Odisha, India; School of Computer Science and Engineering, Lovely Professional University, Phagwara, Punjab, India; Department of Computer Science and Engineering, Manipal Institute of Technology Bengaluru, Manipal Academy of Higher Education, Manipal, India; Presently, the exponential growth of unstructured data on the web and social networks has made it increasingly challenging for individuals to retrieve relevant information efficiently. Over the years, various text summarization techniques have been developed to address this issue. However, traditional approaches that rely on directly extracting words often lead to redundancies and fail to establish a strong connection between the summary and the original document. This paper presents a novel Deep Learning (DL)-based text summarization approach incorporating the following phases: pre-processing, feature extraction, vectorization, and summarization using a hybrid Cat Swarm Optimization (CSO) and Harris Hawk Optimization (HHO) algorithm. Initially, input documents undergo pre-processing steps, including sentence segmentation, word tokenization, stop word removal, and lemmatization, to enhance text quality. Features are then extracted using a Restricted Boltzmann Machine (RBM) to obtain nine key attributes. Vectorization is performed using Term Frequency-Inverse Document Frequency (TF-IDF) to represent sentences in vector form. The hybrid CSO-HHO algorithm is subsequently applied to generate summaries. The proposed method’s efficiency was evaluated using datasets from the Document Understanding Conference (DUC), specifically DUC-2002, DUC-2003, and DUC-2005. Metrics such as sensitivity, readability, coherence, precision, BLEU score, ROUGE score, and F-score were analyzed to assess performance. The proposed approach’s results were compared with existing methods, including CSO, QABC, PSO, GJO, FF, and machine learning techniques like SVM and RF. The hybrid CSO-HHO algorithm achieved an accuracy of 99.56%, demonstrating its superiority in text summarization tasks.; https://ieeexplore.ieee.org/document/10870163/; Text summarization; single document summarization; pre-processing; feature extraction; vectorization; hybrid CSO-HHO algorithm |
title | Deep Learning-Based Feature Extraction Technique for Single Document Summarization Using Hybrid Optimization Technique |
topic | Text summarization; single document summarization; pre-processing; feature extraction; vectorization; hybrid CSO-HHO algorithm |
url | https://ieeexplore.ieee.org/document/10870163/ |
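
The pre-processing and TF-IDF vectorization phases named in the description can be illustrated with a minimal Python sketch. This is not the authors' implementation: the stop-word list, the regex-based segmentation and tokenization, the smoothed IDF formula, and the function names (`split_sentences`, `tokenize`, `tfidf_vectors`, `top_sentences`) are all assumptions made for illustration. Lemmatization and the RBM feature extraction are omitted, and the final ranking by summed TF-IDF weight is only a simple stand-in for the hybrid CSO-HHO selection step, which the record does not specify in enough detail to reproduce.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; the record does not name one).
STOP_WORDS = {"a", "an", "and", "are", "at", "is", "in", "of", "on", "the", "to"}


def split_sentences(text):
    """Naive sentence segmentation: split after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase word tokenization with stop-word removal (no lemmatization)."""
    words = re.findall(r"[a-z0-9]+", sentence.lower())
    return [w for w in words if w not in STOP_WORDS]


def tfidf_vectors(sentences):
    """Return (vocabulary, vectors), one TF-IDF vector per sentence.

    Each sentence is treated as a 'document' when computing IDF.
    """
    tokenized = [tokenize(s) for s in sentences]
    vocab = sorted({w for toks in tokenized for w in toks})
    n = len(sentences)
    # Document frequency: number of sentences containing each word.
    df = Counter(w for toks in tokenized for w in set(toks))
    # Smoothed IDF (assumed variant) so widely shared terms keep a positive weight.
    idf = {w: math.log(n / df[w]) + 1.0 for w in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero for empty sentences
        vectors.append([counts[w] / total * idf[w] for w in vocab])
    return vocab, vectors


def top_sentences(text, k=1):
    """Stand-in selection step (NOT CSO-HHO): keep the k sentences with the
    largest summed TF-IDF weight, returned in original document order."""
    sentences = split_sentences(text)
    _, vectors = tfidf_vectors(sentences)
    scores = [sum(v) for v in vectors]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(ranked)]
```

Calling `top_sentences(document_text, k=3)` would return three sentences in their original order. In the approach the description outlines, an optimizer would search over sentence subsets instead of taking a greedy top-k, so this sketch only shows how sentence vectors could feed such a step.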