FrameProver: Leveraging Formal Frameworks in Building Proofs for Enhanced Inference

Haojia Shan, Beibei Xiong, Niuniu Qi, Zhengfeng Yang*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Formal theorem proving with large language models (LLMs) remains a significant challenge due to the inherent complexity of structured reasoning and the scarcity of formal proof data. While recent efforts have focused on addressing the limited availability of formal proof data through techniques such as data augmentation and automatic formalization, these approaches continue to face substantial challenges, leaving the problem of data scarcity unresolved. Therefore, to better leverage existing formal data, we propose FrameProver, a whole-proof generation model that first constructs an intermediate formal framework to guide the generation of complete proofs. To support this, we analyze proof patterns in Lean 4 and design formal framework templates, which are used to generate training data with the help of our tool, REPL4Framework. Based on the Lean-workbook-proofs dataset, we construct a balanced dataset of 21.3k examples, including 10.6k examples with formal frameworks. FrameProver is fine-tuned from DeepSeek-Prover-V1.5-Base on this dataset and achieves 40.2% accuracy (pass@128, from 37.3% without the framework) on the MiniF2F-Test benchmark.

Original languageEnglish
Title of host publicationAdvanced Intelligent Computing Technology and Applications - 21st International Conference, ICIC 2025, Proceedings
EditorsDe-Shuang Huang, Chuanlei Zhang, Qinhu Zhang, Yijie Pan
PublisherSpringer Science and Business Media Deutschland GmbH
Pages188-200
Number of pages13
ISBN (Print)9789819698837
DOIs
StatePublished - 2025
Event21st International Conference on Intelligent Computing, ICIC 2025 - Ningbo, China
Duration: 26 Jul 202529 Jul 2025

Publication series

NameLecture Notes in Computer Science
Volume15850 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference21st International Conference on Intelligent Computing, ICIC 2025
Country/TerritoryChina
CityNingbo
Period26/07/2529/07/25

Keywords

  • Formal Theorem Proving
  • Large Language Model
  • Mathematical Reasoning

Fingerprint

Dive into the research topics of 'FrameProver: Leveraging Formal Frameworks in Building Proofs for Enhanced Inference'. Together they form a unique fingerprint.

Cite this