Charting the Uncharted: Building and Analyzing a Multifaceted Chart Question Answering Dataset for Complex Logical Reasoning Process

  • Anran Wu*
  • Shuwen Yang
  • Yujia Xia
  • Xingjiao Wu
  • Tianlong Ma
  • Liang He
  • *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Charts, as a vital part of visualization language, are omnipresent in the real world. Understanding charts is crucial for unveiling implicit data insights. The evolution of large-scale models has marked significant milestones in chart comprehension. However, comprehending multiple charts jointly remains challenging, both because multi-chart reasoning is complex and because constructing datasets that involve multiple charts is intricate. In this study, we introduce DGE, a logic-based multi-chart question-answering dataset generation engine that, with only simple data input, generates diverse joint charts and questions with complex logic. It employs logical templates to guide question generation, which makes it highly scalable. Leveraging the DGE engine, we propose MCQA, the first large-scale dataset for joint reasoning question answering over multiple charts, which includes 22,860 chart pairs and 100,331 complex questions, each annotated with an inference process. Finally, we evaluate several baselines on the MCQA dataset, establishing a research foundation for the chart question answering community. The MCQA dataset is available on GitHub (https://github.com/ICALK-CVU/MCQA).

Original language: English
Title of host publication: Pattern Recognition and Computer Vision - 7th Chinese Conference, PRCV 2024, Proceedings
Editors: Zhouchen Lin, Hongbin Zha, Ming-Ming Cheng, Ran He, Cheng-Lin Liu, Kurban Ubul, Wushouer Silamu, Jie Zhou
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 18-33
Number of pages: 16
ISBN (Print): 9789819786190
State: Published - 2025
Event: 7th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2024 - Urumqi, China
Duration: 18 Oct 2024 → 20 Oct 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 15035 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 7th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2024
Country/Territory: China
City: Urumqi
Period: 18/10/24 → 20/10/24

Keywords

  • Chart Question Answering
  • Dataset Generation
  • Multiple Chart Reasoning
