Extraction-Augmented Generation of Scientific Abstracts Using Knowledge Graphs
Graph-to-text generation for specialized tasks, such as scientific abstract generation, is challenging due to the limited availability of structured knowledge graphs and the need to balance semantic accuracy with paragraph coherence. This motivates our proposal of an Extraction-Augmented Scientific...
| Main Authors: | Haotong Wang, Yves Lepage |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | Extraction-augmented generation; scientific abstract; knowledge graphs; datasets |
| Online Access: | https://ieeexplore.ieee.org/document/10929048/ |
| _version_ | 1849392081098768384 |
|---|---|
| author | Haotong Wang; Yves Lepage |
| collection | DOAJ |
| description | Graph-to-text generation for specialized tasks, such as scientific abstract generation, is challenging due to the limited availability of structured knowledge graphs and the need to balance semantic accuracy with paragraph coherence. This motivates our proposal of an Extraction-Augmented Scientific Abstract Generation (EASAG) model which includes the processes of self-extraction, graph fusion, and abstract generation. The model performs self-determination of entities, followed by fine-grained extraction for each entity, predicting the target entity by specifying relations to construct semantic triples. The accumulated triples are then represented more logically through knowledge fusion using two proposed methods: Multi-hop Longest Subchain (MLS) and Label Ordering (LO). The former focuses on uncovering the core logical chain of the content, while the latter functionally segments sequences within the knowledge graph. Experimental results indicate that our model improves the quality of generated scientific abstracts through knowledge richness and the integration of discrete information. The two knowledge fusion methods are designed to enhance specific aspects, with one focusing on semantic accuracy and the other on maintaining paragraph structure integrity. Through fine-grained extraction, we reconstructed the Abstract Generation Dataset (AGENDA) and the newly developed ACL Abstract Graph Dataset (ACL-AGD) containing the latest Natural Language Processing (NLP) research, both datasets composed of graph-abstract pairs. Analysis reveals that these datasets exhibit richer relations, enhanced graph connectivity, and a more uniform distribution of relations. |
| format | Article |
| id | doaj-art-c7b4b3d7c18f417d8780d9a1e78eb7af |
| institution | Kabale University |
| issn | 2169-3536 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-c7b4b3d7c18f417d8780d9a1e78eb7af; 2025-08-20T03:40:51Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2025-01-01; vol. 13, pp. 48775-48791; DOI 10.1109/ACCESS.2025.3551756; IEEE Xplore document 10929048; Extraction-Augmented Generation of Scientific Abstracts Using Knowledge Graphs; Haotong Wang (https://orcid.org/0000-0003-3209-5932), Yves Lepage; Graduate School of Information, Production and Systems, Waseda University, Kitakyushu, Japan (both authors); https://ieeexplore.ieee.org/document/10929048/; Extraction-augmented generation, scientific abstract, knowledge graphs, datasets |
| title | Extraction-Augmented Generation of Scientific Abstracts Using Knowledge Graphs |
| topic | Extraction-augmented generation; scientific abstract; knowledge graphs; datasets |
| url | https://ieeexplore.ieee.org/document/10929048/ |