FALCON: Feedback-driven Adaptive Long/short-term memory reinforced Coding OptimizatioN

  • Zeyuan Li
  • Yangfan He
  • Lewei He*
  • Jianhui Wang
  • Tianyu Shi
  • Bin Lei
  • Yuchen Li
  • Qiuwu Chen

  *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Large language models (LLMs) have recently made significant progress in automated code generation. Despite their strong instruction-following capabilities, these models frequently struggle to align with user intent in coding scenarios. In particular, they are hampered by datasets that lack diversity and fail to address specialized tasks or edge cases. Furthermore, challenges in supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) lead to failures in generating precise, human-intent-aligned code. To tackle these challenges and improve code generation performance for automated programming systems, we propose Feedback-driven Adaptive Long/short-term memory reinforced Coding OptimizatioN (FALCON). FALCON leverages long-term memory to retain and apply learned knowledge, short-term memory to incorporate immediate feedback, and meta-reinforcement learning with feedback rewards to address global-local bi-level optimization and enhance adaptability across diverse code generation tasks. Extensive experiments show that FALCON achieves state-of-the-art performance, outperforming other reinforcement learning methods by over 4.5% on MBPP and 6.1% on HumanEval. The code is publicly available at https://anonymous.4open.science/r/FALCON-3B64/README.md.

Original language: English
Title of host publication: 2025 IEEE International Conference on Multimedia and Expo
Subtitle of host publication: Journey to the Center of Machine Imagination, ICME 2025 - Conference Proceedings
Publisher: IEEE Computer Society
ISBN (Electronic): 9798331594954
DOI
Publication status: Published - 2025
Externally published
Event: 2025 IEEE International Conference on Multimedia and Expo, ICME 2025 - Nantes, France
Duration: 30 Jun 2025 to 4 Jul 2025

Publication series

Name: Proceedings - IEEE International Conference on Multimedia and Expo
ISSN (Print): 1945-7871
ISSN (Electronic): 1945-788X

Conference

Conference: 2025 IEEE International Conference on Multimedia and Expo, ICME 2025
Country/Territory: France
City: Nantes
Period: 30/06/25 to 4/07/25
