TY - GEN
T1 - RBDQ
T2 - 34th ACM Web Conference, WWW Companion 2025
AU - Bi, Fenglin
AU - Cao, Dongdong
AU - Wang, Zhiyu
AU - Chen, Yang
AU - Zhao, Fangliang
AU - Hu, Tao
AU - Li, Zhi
AU - Zhang, Yanbin
AU - Wang, Wei
N1 - Publisher Copyright:
© 2025 Copyright held by the owner/author(s). Publication rights licensed to ACM.
PY - 2025/5/23
Y1 - 2025/5/23
N2 - Using large language models (LLMs) to convert natural language (NL) into SQL simplifies data access for users by allowing them to use everyday language. However, business departments often distrust LLM-based text-to-SQL systems due to the probabilistic nature of SQL generation, which can result in incorrect but executable SQL queries caused by model hallucinations. This leads to significant concerns regarding the accuracy and reliability of the queried data. In this paper, we present RBDQ, a novel LLM-based text-to-SQL system designed to address the unique challenges of business data queries. RBDQ innovatively introduces the Hierarchical Metrics Query Method and integrates advanced Retrieval-Augmented Generation (RAG) methods along with a self-reflection mechanism to tackle these challenges. RBDQ effectively meets the requirements of business metric queries in real-world scenarios. Currently implemented in the Quality Assurance department at ByteDance, RBDQ has significantly improved operational efficiency and query flexibility. Our experiments demonstrate the system's effectiveness, achieving an Execution Accuracy of 96.20%.
AB - Using large language models (LLMs) to convert natural language (NL) into SQL simplifies data access for users by allowing them to use everyday language. However, business departments often distrust LLM-based text-to-SQL systems due to the probabilistic nature of SQL generation, which can result in incorrect but executable SQL queries caused by model hallucinations. This leads to significant concerns regarding the accuracy and reliability of the queried data. In this paper, we present RBDQ, a novel LLM-based text-to-SQL system designed to address the unique challenges of business data queries. RBDQ innovatively introduces the Hierarchical Metrics Query Method and integrates advanced Retrieval-Augmented Generation (RAG) methods along with a self-reflection mechanism to tackle these challenges. RBDQ effectively meets the requirements of business metric queries in real-world scenarios. Currently implemented in the Quality Assurance department at ByteDance, RBDQ has significantly improved operational efficiency and query flexibility. Our experiments demonstrate the system's effectiveness, achieving an Execution Accuracy of 96.20%.
KW - Business Data Queries
KW - Large Language Models
KW - Retrieval-Augmented Generation
KW - Text-to-SQL
UR - https://www.scopus.com/pages/publications/105009212650
U2 - 10.1145/3701716.3715257
DO - 10.1145/3701716.3715257
M3 - 会议稿件
AN - SCOPUS:105009212650
T3 - WWW Companion 2025 - Companion Proceedings of the ACM Web Conference 2025
SP - 95
EP - 103
BT - WWW Companion 2025 - Companion Proceedings of the ACM Web Conference 2025
PB - Association for Computing Machinery, Inc
Y2 - 28 April 2025 through 2 May 2025
ER -