Extraction-Augmented Generation of Scientific Abstracts Using Knowledge Graphs

Graph-to-text generation for specialized tasks, such as scientific abstract generation, is challenging due to the limited availability of structured knowledge graphs and the need to balance semantic accuracy with paragraph coherence. This motivates our proposal of an Extraction-Augmented Scientific...

Bibliographic Details
Main Authors:	Haotong Wang, Yves Lepage
Format:	Article
Language:	English
Published:	IEEE 2025-01-01
Series:	IEEE Access
Subjects:	Extraction-augmented generation; scientific abstract; knowledge graphs; datasets
Online Access:	https://ieeexplore.ieee.org/document/10929048/

_version_	1849392081098768384
author	Haotong Wang; Yves Lepage
author_facet	Haotong Wang; Yves Lepage
author_sort	Haotong Wang
collection	DOAJ
description	Graph-to-text generation for specialized tasks, such as scientific abstract generation, is challenging due to the limited availability of structured knowledge graphs and the need to balance semantic accuracy with paragraph coherence. This motivates our proposal of an Extraction-Augmented Scientific Abstract Generation (EASAG) model which includes the processes of self-extraction, graph fusion, and abstract generation. The model performs self-determination of entities, followed by fine-grained extraction for each entity, predicting the target entity by specifying relations to construct semantic triples. The accumulated triples are then represented more logically through knowledge fusion using two proposed methods: Multi-hop Longest Subchain (MLS) and Label Ordering (LO). The former focuses on uncovering the core logical chain of the content, while the latter functionally segments sequences within the knowledge graph. Experimental results indicate that our model improves the quality of generated scientific abstracts through knowledge richness and the integration of discrete information. The two knowledge fusion methods are designed to enhance specific aspects, with one focusing on semantic accuracy and the other on maintaining paragraph structure integrity. Through fine-grained extraction, we reconstructed the Abstract Generation Dataset (AGENDA) and the newly developed ACL Abstract Graph Dataset (ACL-AGD) containing the latest Natural Language Processing (NLP) research, both datasets composed of graph-abstract pairs. Analysis reveals that these datasets exhibit richer relations, enhanced graph connectivity, and a more uniform distribution of relations.
format	Article
id	doaj-art-c7b4b3d7c18f417d8780d9a1e78eb7af
institution	Kabale University
issn	2169-3536
language	English
publishDate	2025-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj-art-c7b4b3d7c18f417d8780d9a1e78eb7af; 2025-08-20T03:40:51Z; eng; IEEE; IEEE Access; 2169-3536; 2025-01-01; 13; 48775; 48791; 10.1109/ACCESS.2025.3551756; 10929048; Extraction-Augmented Generation of Scientific Abstracts Using Knowledge Graphs; Haotong Wang (0), https://orcid.org/0000-0003-3209-5932; Yves Lepage (1); Graduate School of Information, Production and Systems, Waseda University, Kitakyushu, Japan; Graduate School of Information, Production and Systems, Waseda University, Kitakyushu, Japan; Graph-to-text generation for specialized tasks, such as scientific abstract generation, is challenging due to the limited availability of structured knowledge graphs and the need to balance semantic accuracy with paragraph coherence. This motivates our proposal of an Extraction-Augmented Scientific Abstract Generation (EASAG) model which includes the processes of self-extraction, graph fusion, and abstract generation. The model performs self-determination of entities, followed by fine-grained extraction for each entity, predicting the target entity by specifying relations to construct semantic triples. The accumulated triples are then represented more logically through knowledge fusion using two proposed methods: Multi-hop Longest Subchain (MLS) and Label Ordering (LO). The former focuses on uncovering the core logical chain of the content, while the latter functionally segments sequences within the knowledge graph. Experimental results indicate that our model improves the quality of generated scientific abstracts through knowledge richness and the integration of discrete information. The two knowledge fusion methods are designed to enhance specific aspects, with one focusing on semantic accuracy and the other on maintaining paragraph structure integrity. Through fine-grained extraction, we reconstructed the Abstract Generation Dataset (AGENDA) and the newly developed ACL Abstract Graph Dataset (ACL-AGD) containing the latest Natural Language Processing (NLP) research, both datasets composed of graph-abstract pairs. Analysis reveals that these datasets exhibit richer relations, enhanced graph connectivity, and a more uniform distribution of relations.; https://ieeexplore.ieee.org/document/10929048/; Extraction-augmented generation; scientific abstract; knowledge graphs; datasets
spellingShingle	Haotong Wang; Yves Lepage; Extraction-Augmented Generation of Scientific Abstracts Using Knowledge Graphs; IEEE Access; Extraction-augmented generation; scientific abstract; knowledge graphs; datasets
title	Extraction-Augmented Generation of Scientific Abstracts Using Knowledge Graphs
title_full	Extraction-Augmented Generation of Scientific Abstracts Using Knowledge Graphs
title_fullStr	Extraction-Augmented Generation of Scientific Abstracts Using Knowledge Graphs
title_full_unstemmed	Extraction-Augmented Generation of Scientific Abstracts Using Knowledge Graphs
title_short	Extraction-Augmented Generation of Scientific Abstracts Using Knowledge Graphs
title_sort	extraction augmented generation of scientific abstracts using knowledge graphs
topic	Extraction-augmented generation; scientific abstract; knowledge graphs; datasets
url	https://ieeexplore.ieee.org/document/10929048/
work_keys_str_mv	AT haotongwang extractionaugmentedgenerationofscientificabstractsusingknowledgegraphs AT yveslepage extractionaugmentedgenerationofscientificabstractsusingknowledgegraphs
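
The description in this record says that extracted triples are fused into a "core logical chain" by the Multi-hop Longest Subchain (MLS) method, but it does not give the algorithm. The following is only a minimal sketch of one possible reading, assuming that the chain is the longest simple head-to-tail path through the extracted triples. The function name longest_chain, the relation labels, and the sample triples are all hypothetical illustrations and are not taken from the article.

```python
from collections import defaultdict


def longest_chain(triples):
    """Return the longest simple chain of (head, relation, tail) triples.

    Assumption: a "chain" is a path in which each triple's tail is the
    next triple's head and no entity is visited twice. This exhaustive
    search is only suitable for small graphs.
    """
    # Index outgoing edges by head entity.
    out_edges = defaultdict(list)
    for head, rel, tail in triples:
        out_edges[head].append((rel, tail))

    best = []

    def dfs(node, visited, path):
        nonlocal best
        # Keep the first chain found that is strictly longer than the best.
        if len(path) > len(best):
            best = list(path)
        for rel, tail in out_edges.get(node, []):
            if tail not in visited:
                visited.add(tail)
                path.append((node, rel, tail))
                dfs(tail, visited, path)
                path.pop()
                visited.remove(tail)

    # Try every entity that has at least one outgoing edge as a start.
    for start in list(out_edges):
        dfs(start, {start}, [])
    return best


# Hypothetical triples, for illustration only.
triples = [
    ("EASAG", "USED-FOR", "abstract generation"),
    ("knowledge fusion", "PART-OF", "EASAG"),
    ("MLS", "HYPONYM-OF", "knowledge fusion"),
    ("LO", "HYPONYM-OF", "knowledge fusion"),
]
chain = longest_chain(triples)
for head, rel, tail in chain:
    print(head, rel, tail)
```

Ties between equally long chains are broken by insertion order in this sketch; the article may well use a different criterion.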

Similar Items