Generate Text-to-SQL Queries Based on Sketch Filling

Bibliographic Details
Main Authors: Yinpei Fu, Songtao Ye, Hongjie Fan
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Access
Subjects: Text-to-SQL; sketch filling; semantic parsing; data augmentation; attention mechanism
Online Access: https://ieeexplore.ieee.org/document/10711192/
author Yinpei Fu
Songtao Ye
Hongjie Fan
collection DOAJ
description The Text-to-SQL task has significant application prospects in automating query interfaces for relational databases: it can reduce the learning cost for users and improve the efficiency of data queries. However, natural language queries often do not explicitly mention the columns or condition values that the SQL statement requires, which leads to semantic gaps and insufficient information. This paper proposes a deep learning approach based on sketch filling to address both problems. To tackle insufficient information, the model preprocesses the natural language queries, marks the named entities associated with the database table schema and content, and augments the data by randomly swapping entities. This augmentation strengthens training on common natural language query templates, improving the model’s accuracy on typical questions. To address semantic gaps, the model introduces the table content missing from the natural language queries during semantic encoding, and uses an attention mechanism to enhance the representation of that table content, enabling the Text-to-SQL model to better understand queries. The proposed model achieves better results on two benchmarks, and ablation experiments show that both the data augmentation and the table content enhancement schemes improve the model’s performance.
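The entity-swapping augmentation mentioned in the description can be illustrated with a minimal sketch. This is not the authors' implementation: the toy table, the substring-based entity marking, and the function names `mark_entities` and `augment` are all assumptions made for illustration only.

```python
import random

# Toy table content: column name -> cell values (illustrative data only).
TABLE = {
    "city": ["Beijing", "Xiangtan", "Shanghai"],
    "year": ["2022", "2023", "2024"],
}


def mark_entities(question, table):
    """Return (column, value) pairs for cell values mentioned in the question.

    Plain substring matching is an assumption; a real system would use a
    proper named-entity or schema-linking step.
    """
    found = []
    for column, values in table.items():
        for value in values:
            if value in question:
                found.append((column, value))
    return found


def augment(question, sql, table, rng, n=3):
    """Build n new (question, sql) pairs by swapping each marked entity
    for a different value drawn from the same column."""
    marked = mark_entities(question, table)
    samples = []
    for _ in range(n):
        new_q, new_sql = question, sql
        for column, value in marked:
            candidates = [v for v in table[column] if v != value]
            if not candidates:
                continue  # nothing to swap with in this column
            replacement = rng.choice(candidates)
            # Apply the same swap to both sides so the pair stays consistent.
            new_q = new_q.replace(value, replacement)
            new_sql = new_sql.replace(value, replacement)
        samples.append((new_q, new_sql))
    return samples


rng = random.Random(0)  # fixed seed so runs are repeatable
question = "How many stores opened in Beijing in 2023?"
sql = "SELECT COUNT(*) FROM stores WHERE city = 'Beijing' AND year = '2023'"
pairs = augment(question, sql, TABLE, rng)
for q, s in pairs:
    print(q, "->", s)
```

Each generated pair keeps the question template and the SQL structure fixed while varying only the entity values, which is the sense in which such augmentation reinforces common query templates.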
format Article
id doaj-art-cc31d32e0a8b4416b4b18a7d42cead7c
institution OA Journals
issn 2169-3536
language English
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-cc31d32e0a8b4416b4b18a7d42cead7c
Timestamp: 2025-08-20T02:36:59Z
Language: eng
Publisher: IEEE
Journal: IEEE Access (ISSN 2169-3536)
Published: 2024-01-01, vol. 12, pp. 152392-152403
DOI: 10.1109/ACCESS.2024.3476927
IEEE document number: 10711192
Title: Generate Text-to-SQL Queries Based on Sketch Filling
Authors:
Yinpei Fu (https://orcid.org/0009-0000-5441-6341), School of Computer Science, Xiangtan University, Xiangtan, Hunan, China
Songtao Ye (https://orcid.org/0000-0003-1728-9878), School of Computer Science, Xiangtan University, Xiangtan, Hunan, China
Hongjie Fan (https://orcid.org/0000-0002-4872-8557), Department of Science and Technology Teaching, China University of Political Science and Law, Beijing, China
URL: https://ieeexplore.ieee.org/document/10711192/
Keywords: Text-to-SQL; sketch filling; semantic parsing; data augmentation; attention mechanism
title Generate Text-to-SQL Queries Based on Sketch Filling
topic Text-to-SQL
sketch filling
semantic parsing
data augmentation
attention mechanism
url https://ieeexplore.ieee.org/document/10711192/