Extraction-Augmented Generation of Scientific Abstracts Using Knowledge Graphs

Bibliographic Details
Main Authors:	Haotong Wang, Yves Lepage
Format:	Article
Language:	English
Published:	IEEE 2025-01-01
Series:	IEEE Access
Subjects:	Extraction-augmented generation; scientific abstract; knowledge graphs; datasets
Online Access:	https://ieeexplore.ieee.org/document/10929048/

Description
Summary:	Graph-to-text generation for specialized tasks, such as scientific abstract generation, is challenging due to the limited availability of structured knowledge graphs and the need to balance semantic accuracy with paragraph coherence. This motivates our proposal of an Extraction-Augmented Scientific Abstract Generation (EASAG) model, which comprises three processes: self-extraction, graph fusion, and abstract generation. The model determines the entities itself and then performs fine-grained extraction for each entity, predicting the target entity by specifying relations to construct semantic triples. The accumulated triples are then represented more logically through knowledge fusion using two proposed methods: Multi-hop Longest Subchain (MLS) and Label Ordering (LO). The former focuses on uncovering the core logical chain of the content, while the latter functionally segments sequences within the knowledge graph. Experimental results indicate that our model improves the quality of generated scientific abstracts through knowledge richness and the integration of discrete information. The two knowledge fusion methods are designed to enhance specific aspects, with one focusing on semantic accuracy and the other on maintaining paragraph structure integrity. Through fine-grained extraction, we reconstructed the Abstract Generation Dataset (AGENDA) and built the new ACL Abstract Graph Dataset (ACL-AGD), which contains the latest Natural Language Processing (NLP) research; both datasets consist of graph-abstract pairs. Analysis reveals that these datasets exhibit richer relations, enhanced graph connectivity, and a more uniform distribution of relations.
ISSN:	2169-3536
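
Illustrative sketch (not from the article): the summary describes semantic triples and a fusion step that uncovers the "core logical chain" of the content, but it does not give the algorithm. The Python below shows one possible reading under stated assumptions: triples are (head, relation, tail) tuples, and the chain is the longest head-to-tail path that never revisits an entity. The function name longest_chain, the example entities, and the relation labels are all hypothetical and should not be taken as the authors' MLS method.

```python
from collections import defaultdict

# Assumed representation: a triple is (head entity, relation, tail entity).
Triple = tuple[str, str, str]


def longest_chain(triples: list[Triple]) -> list[Triple]:
    """Return the longest head-to-tail chain of triples that never revisits an entity.

    Exhaustive depth-first search; only practical for small, abstract-sized graphs.
    """
    outgoing = defaultdict(list)
    for head, relation, tail in triples:
        outgoing[head].append((relation, tail))

    best: list[Triple] = []

    def extend(entity: str, visited: frozenset, chain: list[Triple]) -> None:
        nonlocal best
        if len(chain) > len(best):
            best = list(chain)  # keep a copy of the longest chain seen so far
        for relation, tail in outgoing.get(entity, []):
            if tail not in visited:  # skip entities already on the chain
                chain.append((entity, relation, tail))
                extend(tail, visited | {tail}, chain)
                chain.pop()

    # Try every entity that has at least one outgoing relation as a start point.
    for start in list(outgoing):
        extend(start, frozenset({start}), [])
    return best


# Hypothetical triples, invented for illustration only.
example_triples = [
    ("self-extraction", "USED-FOR", "triples"),
    ("triples", "PART-OF", "knowledge graph"),
    ("knowledge graph", "USED-FOR", "abstract generation"),
    ("triples", "FEATURE-OF", "entities"),
]

chain = longest_chain(example_triples)
for head, relation, tail in chain:
    print(f"{head} --{relation}--> {tail}")
```

In this toy graph the search follows the three-hop path from "self-extraction" to "abstract generation" and leaves the side branch to "entities" out of the chain. The exhaustive search is exponential in the worst case, so a real system would likely need a different strategy for larger graphs.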

