Based on BERT-GPT-GNN converged architecture: intelligent generation engine for complex SQL queries in business intelligence

Bibliographic Details
Main Authors: Shiwei Chu, Jie Liu
Format: Article
Language: English
Published: Springer 2025-07-01
Series: Discover Artificial Intelligence
Online Access: https://doi.org/10.1007/s44163-025-00381-y
Description
Summary: Modern enterprises increasingly rely on data-driven decision-making, but writing SQL (Structured Query Language) queries requires specialist knowledge, which limits their use by non-technical personnel. Advances in natural language processing, especially deep learning generative models, have made text2SQL (Text-to-SQL) conversion possible. RAG (Retrieval-Augmented Generation) further improves the accuracy and relevance of answers by combining the advantages of retrieval and generation. This article develops a RAG-based text2SQL business intelligence system that lets enterprise users extract actionable insights from complex databases through natural language queries. The system aims to streamline data retrieval, lower technical barriers for non-specialist users, and achieve state-of-the-art performance in SQL query generation for complex tasks. It uses a BERT (Bidirectional Encoder Representations from Transformers) model for vectorized retrieval and the pre-trained GPT-4 (Generative Pre-trained Transformer 4) model for generation, combined with a GNN (Graph Neural Network) that models the database structure to improve the generation of complex queries. The model's semantic understanding is iteratively optimized through user interaction and a feedback mechanism. The experimental results show that BERT + GPT-4 + GNN performs strongly in matching accuracy for multi-table joins and nested queries. For multi-table joins, its query matching accuracy is 52.3% at beam width 1 and 55.1% at beam width 10. For nested queries with multi-table joins, its query matching accuracy is 60.2% at beam width 1 and 61.9% at beam width 10. BERT + GPT-4 + GNN also receives the highest user satisfaction score, which supports its advantage in practical applications. The RAG-based text2SQL business intelligence system proposed in this article significantly improves the ability to process complex queries and reduces data access barriers, thereby providing enterprise users with more convenient and efficient database query tools.
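The retrieve-then-generate flow described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation. A toy bag-of-words vector stands in for the BERT embedding, a breadth-first search over a foreign-key graph stands in for the GNN's use of database structure, and the assembled prompt is only returned rather than sent to a generator such as GPT-4. The schema, the table descriptions, and every function name (`embed`, `retrieve_tables`, `join_path`, `build_prompt`) are hypothetical.

```python
import math
import re
from collections import Counter, deque

# Hypothetical toy schema: table name -> short description used for retrieval.
SCHEMA_DOCS = {
    "customers": "customers table: customer id, name, region",
    "orders": "orders table: order id, customer id, order date, total amount",
    "products": "products table: product id, name, category, price",
    "order_items": "order items table: order id, product id, quantity",
}

# Hypothetical foreign-key graph as an undirected adjacency list.
FK_GRAPH = {
    "customers": ["orders"],
    "orders": ["customers", "order_items"],
    "order_items": ["orders", "products"],
    "products": ["order_items"],
}


def embed(text):
    """Toy bag-of-words vector; an assumed stand-in for a BERT embedding."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_tables(question, k=2):
    """Return the k table names whose descriptions best match the question."""
    q = embed(question)
    ranked = sorted(
        SCHEMA_DOCS,
        key=lambda t: cosine(q, embed(SCHEMA_DOCS[t])),
        reverse=True,
    )
    return ranked[:k]


def join_path(start, goal):
    """Shortest chain of tables linking start to goal (BFS, not a GNN)."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in FK_GRAPH[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return []


def build_prompt(question, k=2):
    """Assemble a prompt from retrieved tables plus any bridging tables."""
    tables = retrieve_tables(question, k)
    needed = set(tables)
    for other in tables[1:]:
        needed.update(join_path(tables[0], other))
    schema_lines = [SCHEMA_DOCS[t] for t in sorted(needed)]
    return "Schema:\n" + "\n".join(schema_lines) + "\nQuestion: " + question + "\nSQL:"


if __name__ == "__main__":
    print(build_prompt("Which customer region bought products in each category?"))
```

In this sketch the graph step matters because retrieval alone may return two tables that cannot be joined directly. Walking the foreign-key graph adds the bridging tables to the prompt, which is one plausible reason structural information would help with multi-table joins.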
ISSN: 2731-0809