Chat2Query: A Zero-Shot Automatic Exploratory Data Analysis System with Large Language Models

Jun Peng Zhu, Peng Cai, Boyan Niu, Zheming Ni, Kai Xu, Jiajun Huang, Jianwei Wan, Shengbo Ma, Bing Wang, Donghui Zhang, Liu Tang, Qi Liu

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

10 Scopus citations

Abstract

Data analysts often encounter two primary challenges while conducting exploratory data analysis by SQL: (1) the need to skillfully craft SQL queries, and (2) the requirement to generate suitable visualizations that enhance the interpretation of query results. The emergence of large language models (LLMs) has inaugurated a paradigm shift in text-to-SQL and data-to-chart. This paper presents Chat2Query, an LLM -empowered zero-shot automatic exploration data analysis system. Firstly, Chat2Query provides a user-friendly interface that allows users to employ natural languages to interact with the database directly. Secondly, Chat2Query offers an LLM -empowered text-to-SQL generator, SQL rewriter, SQL formatter, and data-to-chart generator. Thirdly, Chat2Query is uniquely distinguished by its underlying incorporation of the TiDB Serverless, fostering superior elasticity and scalability. This strategic integration empowers Chat2Query with the capability to seamlessly adapt to change workloads, aligning with the evolving demands of the user. We have implemented and deployed Chat2Query in the production environment, and demonstrate its usability and efficiency in three representative real-world scenarios.

Original languageEnglish
Title of host publicationProceedings - 2024 IEEE 40th International Conference on Data Engineering, ICDE 2024
PublisherIEEE Computer Society
Pages5429-5432
Number of pages4
ISBN (Electronic)9798350317152
DOIs
StatePublished - 2024
Event40th IEEE International Conference on Data Engineering, ICDE 2024 - Utrecht, Netherlands
Duration: 13 May 202417 May 2024

Publication series

NameProceedings - International Conference on Data Engineering
ISSN (Print)1084-4627
ISSN (Electronic)2375-0286

Conference

Conference40th IEEE International Conference on Data Engineering, ICDE 2024
Country/TerritoryNetherlands
CityUtrecht
Period13/05/2417/05/24

Keywords

  • EDA
  • LLMs
  • Prompting
  • Text-to-SQL
  • Usability

Fingerprint

Dive into the research topics of 'Chat2Query: A Zero-Shot Automatic Exploratory Data Analysis System with Large Language Models'. Together they form a unique fingerprint.

Cite this