FALCON: Feedback-driven Adaptive Long/short-term memory reinforced Coding OptimizatioN

Zeyuan Li, Yangfan He, Lewei He*, Jianhui Wang, Tianyu Shi, Bin Lei, Yuchen Li, Qiuwu Chen

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Recently, large language models (LLMs) have achieved significant progress in automated code generation. Despite their strong instruction-following capabilities, these models frequently struggle to align with user intent in coding scenarios. In particular, they are hampered by datasets that lack diversity and fail to address specialized tasks or edge cases. Furthermore, challenges in supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) lead to failures in generating precise, human-intent-aligned code. To tackle these challenges and improve the code generation performance of automated programming systems, we propose Feedback-driven Adaptive Long/short-term memory reinforced Coding OptimizatioN (FALCON). FALCON leverages long-term memory to retain and apply learned knowledge, short-term memory to incorporate immediate feedback, and meta-reinforcement learning with feedback rewards to address global-local bi-level optimization and enhance adaptability across diverse code generation tasks. Extensive experiments show that FALCON achieves state-of-the-art performance, outperforming other reinforcement learning methods by over 4.5% on MBPP and 6.1% on HumanEval. The code is publicly available at https://anonymous.4open.science/r/FALCON-3B64/README.md.

Original language: English
Title of host publication: 2025 IEEE International Conference on Multimedia and Expo
Subtitle of host publication: Journey to the Center of Machine Imagination, ICME 2025 - Conference Proceedings
Publisher: IEEE Computer Society
ISBN (Electronic): 9798331594954
State: Published - 2025
Externally published: Yes
Event: 2025 IEEE International Conference on Multimedia and Expo, ICME 2025 - Nantes, France
Duration: 30 Jun 2025 → 4 Jul 2025

Publication series

Name: Proceedings - IEEE International Conference on Multimedia and Expo
ISSN (Print): 1945-7871
ISSN (Electronic): 1945-788X

Conference

Conference: 2025 IEEE International Conference on Multimedia and Expo, ICME 2025
Country/Territory: France
City: Nantes
Period: 30/06/25 → 4/07/25

Keywords

  • Code generation
  • Diverse Feedback
  • Reinforcement Learning
