TY - JOUR
T1 - NL2ACSL
T2 - Interactively translating natural language to ANSI C specification language with large language models
AU - Su, Wenxian
AU - Wu, Xi
AU - Zhao, Yongxin
N1 - Publisher Copyright:
© 2025 Elsevier B.V.
PY - 2026/2/28
Y1 - 2026/2/28
N2 - The ANSI/ISO C Specification Language (ACSL) is widely used to describe the behavioral properties of C programs in formal verification. However, manually crafting formal program specifications requires substantial expertise and effort, which makes it time-consuming and demanding. To reduce the burden on software developers, recent advances in Large Language Models (LLMs) have inspired studies on automated specification generation. Although LLMs demonstrate strong capabilities in code understanding and generation, existing methods are limited owing to the lack of ACSL-annotated data and the need for expert intervention, making completetly automated and correct specification generation difficult. To address these challenges, this study proposes NL2ACSL, an LLM-based multi-agent framework that automatically translates natural language descriptions into formal ACSL function contracts. In NL2ACSL, the natural language description is first analyzed to identify the intended functional requirements. These requirements are then aligned with an ACSL knowledge base using retrieval-augmented generation (RAG), which provides grammar patterns and logic templates for specification construction. The retrieved context is passed to a specification generation agent that uses chain-of-thought (CoT) prompting to generate candidate ACSL annotations. Finally, a dedicated evaluation agent trained using reinforcement learning selects the specification that best preserves semantic intent while adhering to ACSL syntactic rules. Experimental results show that NL2ACSL achieves verification success rates of 74.4% and 69.5% with GPT-4o and DeepSeek-V3, respectively, outperforming existing LLM-based methods and demonstrating improved accuracy and applicability in formal software verification.
AB - The ANSI/ISO C Specification Language (ACSL) is widely used to describe the behavioral properties of C programs in formal verification. However, manually crafting formal program specifications requires substantial expertise and effort, which makes it time-consuming and demanding. To reduce the burden on software developers, recent advances in Large Language Models (LLMs) have inspired studies on automated specification generation. Although LLMs demonstrate strong capabilities in code understanding and generation, existing methods are limited owing to the lack of ACSL-annotated data and the need for expert intervention, making completetly automated and correct specification generation difficult. To address these challenges, this study proposes NL2ACSL, an LLM-based multi-agent framework that automatically translates natural language descriptions into formal ACSL function contracts. In NL2ACSL, the natural language description is first analyzed to identify the intended functional requirements. These requirements are then aligned with an ACSL knowledge base using retrieval-augmented generation (RAG), which provides grammar patterns and logic templates for specification construction. The retrieved context is passed to a specification generation agent that uses chain-of-thought (CoT) prompting to generate candidate ACSL annotations. Finally, a dedicated evaluation agent trained using reinforcement learning selects the specification that best preserves semantic intent while adhering to ACSL syntactic rules. Experimental results show that NL2ACSL achieves verification success rates of 74.4% and 69.5% with GPT-4o and DeepSeek-V3, respectively, outperforming existing LLM-based methods and demonstrating improved accuracy and applicability in formal software verification.
KW - Large language model
KW - Program verification
KW - Reinforcement learning
KW - Specification inference
UR - https://www.scopus.com/pages/publications/105026129525
U2 - 10.1016/j.knosys.2025.115177
DO - 10.1016/j.knosys.2025.115177
M3 - 文章
AN - SCOPUS:105026129525
SN - 0950-7051
VL - 335
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
M1 - 115177
ER -