TY - GEN
T1 - A Unified Framework for Knowledge-Intensive Numerical Reasoning over Financial Document
AU - Yin, Long
AU - Yin, Kai
AU - Zhao, Hui
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
PY - 2026
Y1 - 2026
AB - Numerical reasoning in financial document analysis constitutes a fundamental challenge for corporate financial report understanding, drawing increasing attention in both academia and industry. However, current approaches suffer from semantic misalignment in multi-hierarchical tables, reasoning disruptions from insufficient integration of financial metric formulas, and sensitivity of the model’s reasoning results to the order of evidence. To address these challenges, we propose a unified framework for knowledge-intensive numerical reasoning over financial documents. Within this framework, we introduce a Triplets-based Multi-Stage Tabular-Textual Hybrid Evidence Retrieval (THER) method to resolve semantic misalignment by converting multi-hierarchical tables into triplet representations. Furthermore, we propose the Fine Grained Knowledge Injected Chain-of-Thought (FGKI-CoT) method to enhance numerical reasoning by explicitly integrating financial conceptual formulas into the reasoning path. Building on FGKI-CoT, we introduce the Evidence Order Sampling based Self-Consistency (EOSC) method, which mitigates the model’s sensitivity to evidence order by altering the input evidence sequence. Experiments demonstrate that our framework enables a 1.5B-parameter language model to outperform GPT-3.5-turbo by 3.95% in numerical reasoning on the MultiHiertt Dev dataset. Additionally, we conduct supplementary experiments to further explore the impact of table representations and reasoning step expressions on the numerical reasoning performance of language models.
KW - Chain of Thought
KW - Financial Document Understanding
KW - Numerical Reasoning
UR - https://www.scopus.com/pages/publications/105017377904
U2 - 10.1007/978-3-032-04627-7_3
DO - 10.1007/978-3-032-04627-7_3
M3 - Conference contribution
AN - SCOPUS:105017377904
SN - 9783032046260
T3 - Lecture Notes in Computer Science
SP - 38
EP - 59
BT - Document Analysis and Recognition – ICDAR 2025 - 19th International Conference, Proceedings
A2 - Yin, Xu-Cheng
A2 - Karatzas, Dimosthenis
A2 - Lopresti, Daniel
PB - Springer Science and Business Media Deutschland GmbH
T2 - 19th International Conference on Document Analysis and Recognition, ICDAR 2025
Y2 - 16 September 2025 through 21 September 2025
ER -