A Unified Framework for Knowledge-Intensive Numerical Reasoning over Financial Document

Long Yin, Kai Yin*, Hui Zhao*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Numerical reasoning in financial document analysis constitutes a fundamental challenge for corporate financial report understanding, drawing increasing attention in both academia and industry. However, current approaches suffer from semantic misalignment in multi-hierarchical tables, reasoning disruptions from insufficient integration of financial metric formulas, and sensitivity of the model’s reasoning results to the order of evidence. To address these challenges, we propose a unified framework for knowledge-intensive numerical reasoning over financial documents. Within this framework, we introduce a Triplets-based Multi-Stage Tabular-Textual Hybrid Evidence Retrieval (THER) method to resolve semantic misalignment by converting multi-hierarchical tables into triplet representations. Furthermore, we propose the Fine Grained Knowledge Injected Chain-of-Thought (FGKI-CoT) method to enhance numerical reasoning by explicitly integrating financial conceptual formulas into the reasoning path. Building on FGKI-CoT, we introduce the Evidence Order Sampling based Self-Consistency (EOSC) method, which mitigates the model’s sensitivity to evidence order by altering the input evidence sequence. Experiments demonstrate that our framework enables a 1.5B-parameter language model to outperform GPT-3.5-turbo by 3.95% in numerical reasoning on the MultiHiertt dev set. Additionally, we conduct supplementary experiments to further explore the impact of table representations and reasoning step expressions on the numerical reasoning performance of language models.
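The abstract does not give implementation details, so the following is only an illustrative sketch of two ideas it names: flattening a multi-hierarchical table into triplets, and voting over answers obtained from shuffled evidence orders. The function names, the `(row path, column path, value)` triplet layout, the ` > ` path separator, and the `answer_fn` callback (a stand-in for a language-model call) are all assumptions made for illustration and are not taken from the paper.

```python
import random
from collections import Counter
from typing import Callable, List, Optional, Sequence, Tuple

Triplet = Tuple[str, str, str]


def table_to_triplets(
    row_paths: Sequence[Sequence[str]],
    col_paths: Sequence[Sequence[str]],
    cells: Sequence[Sequence[Optional[object]]],
) -> List[Triplet]:
    """Flatten a hierarchical table into (row path, column path, value) triplets.

    Assumed layout: each row/column is described by its full header path,
    e.g. ["Revenue", "Product"], so nested headers stay attached to the value.
    """
    triplets: List[Triplet] = []
    for i, row in enumerate(row_paths):
        for j, col in enumerate(col_paths):
            value = cells[i][j]
            if value is None:  # skip empty cells
                continue
            triplets.append((" > ".join(row), " > ".join(col), str(value)))
    return triplets


def order_sampled_vote(
    evidence: Sequence[str],
    answer_fn: Callable[[List[str]], str],
    num_samples: int = 5,
    seed: int = 0,
) -> str:
    """Query answer_fn on several shuffled evidence orders; return the majority answer.

    answer_fn is a hypothetical placeholder for a model call that maps an
    ordered evidence list to a final answer string.
    """
    rng = random.Random(seed)
    votes: Counter = Counter()
    for _ in range(num_samples):
        order = list(evidence)
        rng.shuffle(order)  # alter the input evidence sequence
        votes[answer_fn(order)] += 1
    return votes.most_common(1)[0][0]
```

In this sketch, an answer that only appears under a few unlucky evidence orders is outvoted by the answer the model produces most often, which is the general intuition behind self-consistency; how the paper samples orders and aggregates answers may differ.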

Original language: English
Title of host publication: Document Analysis and Recognition – ICDAR 2025 - 19th International Conference, Proceedings
Editors: Xu-Cheng Yin, Dimosthenis Karatzas, Daniel Lopresti
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 38-59
Number of pages: 22
ISBN (Print): 9783032046260
State: Published - 2026
Event: 19th International Conference on Document Analysis and Recognition, ICDAR 2025 - Wuhan, China
Duration: 16 Sep 2025 to 21 Sep 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 16026 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 19th International Conference on Document Analysis and Recognition, ICDAR 2025
Country/Territory: China
City: Wuhan
Period: 16/09/25 to 21/09/25

Keywords

  • Chain of Thought
  • Financial Document Understanding
  • Numerical Reasoning
