RBDQ: A Reliable LLM-based Text-to-SQL System for Business Data Queries

Fenglin Bi, Dongdong Cao, Zhiyu Wang, Yang Chen, Fangliang Zhao, Tao Hu, Zhi Li, Yanbin Zhang*, Wei Wang*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Using large language models (LLMs) to convert natural language (NL) into SQL simplifies data access for users by allowing them to use everyday language. However, business departments often distrust LLM-based text-to-SQL systems due to the probabilistic nature of SQL generation, which can result in incorrect but executable SQL queries caused by model hallucinations. This leads to significant concerns regarding the accuracy and reliability of the queried data. In this paper, we present RBDQ, a novel LLM-based text-to-SQL system designed to address the unique challenges of business data queries. RBDQ innovatively introduces the Hierarchical Metrics Query Method and integrates advanced Retrieval-Augmented Generation (RAG) methods along with a self-reflection mechanism to tackle these challenges. RBDQ effectively meets the requirements of business metric queries in real-world scenarios. Currently implemented in the Quality Assurance department at ByteDance, RBDQ has significantly improved operational efficiency and query flexibility. Our experiments demonstrate the system's effectiveness, achieving an Execution Accuracy of 96.20%.

Original language: English
Title of host publication: WWW Companion 2025 - Companion Proceedings of the ACM Web Conference 2025
Publisher: Association for Computing Machinery, Inc
Pages: 95-103
Number of pages: 9
ISBN (Electronic): 9798400713316
State: Published - 23 May 2025
Event: 34th ACM Web Conference, WWW Companion 2025 - Sydney, Australia
Duration: 28 Apr 2025 to 2 May 2025

Publication series

Name: WWW Companion 2025 - Companion Proceedings of the ACM Web Conference 2025

Conference

Conference: 34th ACM Web Conference, WWW Companion 2025
Country/Territory: Australia
City: Sydney
Period: 28/04/25 to 2/05/25

Keywords

  • Business Data Queries
  • Large Language Models
  • Retrieval-Augmented Generation
  • Text-to-SQL
