Improving Data-to-Text Generation via Preserving High-Frequency Phrases and Fact-Checking
Transforming numerical data into natural language descriptions (data-to-text) requires presenting the data in the correct context, supplementing plausible details, and creating an overall coherent and non-conflicting narrative. In this work, we propose a generate-extract-correct pipeline for the tas...
| Main Authors: | Ethan Joseph, Julian Lioanag, Mei Si |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Accademia University Press, 2021-12-01 |
| Series: | IJCoL |
| Online Access: | https://journals.openedition.org/ijcol/909 |
| _version_ | 1850262751240257536 |
|---|---|
| author | Ethan Joseph; Julian Lioanag; Mei Si |
| author_facet | Ethan Joseph; Julian Lioanag; Mei Si |
| author_sort | Ethan Joseph |
| collection | DOAJ |
| description | Transforming numerical data into natural language descriptions (data-to-text) requires presenting the data in the correct context, supplementing plausible details, and creating an overall coherent and non-conflicting narrative. In this work, we propose a generate-extract-correct pipeline for the task. We use transfer learning with an auxiliary task of keeping high-frequency word sequences from the training data for text generation. We then apply information extraction to the generated text to check its accuracy, followed by correction, and thus ensure the coherence of the generated narrative. We demonstrate the effectiveness of this approach with both objective and subjective evaluations. Using an empirical evaluation, we show that people rated our system’s outputs similarly to human-written text regarding its coherence, conciseness, and grammar. |
| format | Article |
| id | doaj-art-6fcfa0c0628c4281bd95766bd6b2a8cc |
| institution | OA Journals |
| issn | 2499-4553 |
| language | English |
| publishDate | 2021-12-01 |
| publisher | Accademia University Press |
| record_format | Article |
| series | IJCoL |
| spelling | doaj-art-6fcfa0c0628c4281bd95766bd6b2a8cc; 2025-08-20T01:55:08Z; eng; Accademia University Press; IJCoL; 2499-4553; 2021-12-01; 7223244; 10.4000/ijcol.909; Improving Data-to-Text Generation via Preserving High-Frequency Phrases and Fact-Checking; Ethan Joseph; Julian Lioanag; Mei Si; Transforming numerical data into natural language descriptions (data-to-text) requires presenting the data in the correct context, supplementing plausible details, and creating an overall coherent and non-conflicting narrative. In this work, we propose a generate-extract-correct pipeline for the task. We use transfer learning with an auxiliary task of keeping high-frequency word sequences from the training data for text generation. We then apply information extraction to the generated text to check its accuracy, followed by correction, and thus ensure the coherence of the generated narrative. We demonstrate the effectiveness of this approach with both objective and subjective evaluations. Using an empirical evaluation, we show that people rated our system’s outputs similarly to human-written text regarding its coherence, conciseness, and grammar.; https://journals.openedition.org/ijcol/909 |
| spellingShingle | Ethan Joseph; Julian Lioanag; Mei Si; Improving Data-to-Text Generation via Preserving High-Frequency Phrases and Fact-Checking; IJCoL |
| title | Improving Data-to-Text Generation via Preserving High-Frequency Phrases and Fact-Checking |
| title_sort | improving data to text generation via preserving high frequency phrases and fact checking |
| url | https://journals.openedition.org/ijcol/909 |