Large scale summarization using ensemble prompts and in context learning approaches

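Abstract The field of Information Assurance (IA) and Cybersecurity has seen substantial evolution, driven by advancements in technology and the increasing sophistication of threats in the digital age. This study employs Large Language Models (LLMs), as well as other advanced NLP techniques, to conduct a comprehensive analysis of literature from 1967 to 2024. By analyzing a corpus of more than 62,000 documents extracted from Scopus, our approach involves a comprehensive methodology that includes two main phases: topic detection using BERTopic and automatic summarization with LLMs across various periods (annual and decades). By designing targeted queries to extract relevant papers, analyzing textual data, and applying advanced prompting techniques for summarization, we integrate computational models to handle large volumes of data. Our results demonstrate that an ensemble of methods (Ev2) outperforms traditional summarization and density-based approaches, with improvements ranging from 16.7% to 29.6% in keyword definition tasks. It generates summaries that outperform in 5 out of the 7 tested metrics while maintaining the logical integrity of bibliographic references. Our results illuminate the shifts in focus within Information Assurance across decades, revealing key breakthroughs and forecasting emerging areas of significance.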


Bibliographic Details
Main Authors:	Andrés Leiva-Araos, Bady Gana, Héctor Allende-Cid, José García, Manob Jyoti Saikia
Format:	Article
Language:	English
Published:	Nature Portfolio 2025-03-01
Series:	Scientific Reports
Subjects:	Information assurance; Cybersecurity trends; Systematic topic review; Large language models; Natural language processing (NLP); Automatic summarization
Online Access:	https://doi.org/10.1038/s41598-025-94551-8

author	Andrés Leiva-Araos; Bady Gana; Héctor Allende-Cid; José García; Manob Jyoti Saikia
collection	DOAJ
description	Abstract The field of Information Assurance (IA) and Cybersecurity has seen substantial evolution, driven by advancements in technology and the increasing sophistication of threats in the digital age. This study employs Large Language Models (LLMs), as well as other advanced NLP techniques, to conduct a comprehensive analysis of literature from 1967 to 2024. By analyzing a corpus of more than 62,000 documents extracted from Scopus, our approach involves a comprehensive methodology that includes two main phases: topic detection using BERTopic and automatic summarization with LLMs across various periods (annual and decades). By designing targeted queries to extract relevant papers, analyzing textual data, and applying advanced prompting techniques for summarization, we integrate computational models to handle large volumes of data. Our results demonstrate that an ensemble of methods (Ev2) outperforms traditional summarization and density-based approaches, with improvements ranging from 16.7% to 29.6% in keyword definition tasks. It generates summaries that outperform in 5 out of the 7 tested metrics while maintaining the logical integrity of bibliographic references. Our results illuminate the shifts in focus within Information Assurance across decades, revealing key breakthroughs and forecasting emerging areas of significance.
format	Article
id	doaj-art-7f113ceaffae4734a572bcdc94c85e75
institution	OA Journals
issn	2045-2322
language	English
publishDate	2025-03-01
publisher	Nature Portfolio
record_format	Article
series	Scientific Reports
spelling	doaj-art-7f113ceaffae4734a572bcdc94c85e75; 2025-08-20T02:10:16Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-03-01; 151121; 10.1038/s41598-025-94551-8; Large scale summarization using ensemble prompts and in context learning approaches; Andrés Leiva-Araos (0); Bady Gana (1); Héctor Allende-Cid (2); José García (3); Manob Jyoti Saikia (4); Department of Computing, University of North Florida; Escuela de Ingeniería Informática, Pontificia Universidad Católica de Valparaíso; Escuela de Ingeniería Informática, Pontificia Universidad Católica de Valparaíso; Escuela de Ingeniería en Construcción y Transporte, Pontificia Universidad Católica de Valparaíso; Department of Electrical Engineering, University of North Florida; Abstract The field of Information Assurance (IA) and Cybersecurity has seen substantial evolution, driven by advancements in technology and the increasing sophistication of threats in the digital age. This study employs Large Language Models (LLMs), as well as other advanced NLP techniques, to conduct a comprehensive analysis of literature from 1967 to 2024. By analyzing a corpus of more than 62,000 documents extracted from Scopus, our approach involves a comprehensive methodology that includes two main phases: topic detection using BERTopic and automatic summarization with LLMs across various periods (annual and decades). By designing targeted queries to extract relevant papers, analyzing textual data, and applying advanced prompting techniques for summarization, we integrate computational models to handle large volumes of data. Our results demonstrate that an ensemble of methods (Ev2) outperforms traditional summarization and density-based approaches, with improvements ranging from 16.7% to 29.6% in keyword definition tasks. It generates summaries that outperform in 5 out of the 7 tested metrics while maintaining the logical integrity of bibliographic references. Our results illuminate the shifts in focus within Information Assurance across decades, revealing key breakthroughs and forecasting emerging areas of significance.; https://doi.org/10.1038/s41598-025-94551-8; Information assurance; Cybersecurity trends; Systematic topic review; Large language models; Natural language processing (NLP); Automatic summarization
title	Large scale summarization using ensemble prompts and in context learning approaches
topic	Information assurance; Cybersecurity trends; Systematic topic review; Large language models; Natural language processing (NLP); Automatic summarization
url	https://doi.org/10.1038/s41598-025-94551-8


Similar Items