SEER: Enhancing Chain-of-Thought Code Generation through Self-Exploring Deep Reasoning
Code generation, the task of creating executable programs from natural language requirements, has recently advanced considerably through Chain-of-Thought (CoT) reasoning, which enables Large Language Models (LLMs) to develop high-level reasoning plans before writing code. Recent research has proposed various methods to enhance models' CoT reasoning for code generation, such as prompt engineering and supervised fine-tuning. However, existing approaches still face three critical limitations: (1) limited exploration of diverse reasoning paths, which constrains generalization across programming scenarios; (2) lack of quality assessment for intermediate reasoning steps, which hampers the reliability of the generated plans and code; and (3) the negative impact of "overthinking", which can lead to unnecessarily complex and incorrect solutions. To address these limitations, we frame CoT code generation as a decision-making problem and present SEER, a SElf-Exploring deep Reasoning framework that enables accurate and adaptive reasoning for code generation. SEER introduces three key components: (1) diverse reasoning path exploration, which explores diverse reasoning paths and annotates intermediate steps without relying on human experts or closed-source proprietary models; (2) reasoning quality-aware model training, which trains a policy model for generating candidate reasoning steps and a value model for assessing their quality; and (3) adaptive CoT reasoning, which dynamically switches between direct generation and step-by-step reasoning for different problems. Experiments on the state-of-the-art code LLMs DeepSeek-Coder and Qwen2.5-Coder demonstrate that SEER consistently outperforms all baseline methods across three popular code generation benchmarks, with absolute improvements of 4.2% to 9.3% on MBPP, 1.9% to 9.1% on HumanEval, and 3.5% to 5.3% on LiveCodeBench.
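The interplay of the policy model, the value model, and adaptive switching can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how SEER decides when to switch or how it searches over steps, so the threshold rule and the greedy step selection below are assumptions, and every name (`adaptive_generate`, `direct_generate`, `propose_steps`, `score`, `is_complete`, `threshold`, `max_steps`) is hypothetical.

```python
from typing import Callable, List


def adaptive_generate(
    problem: str,
    direct_generate: Callable[[str], str],
    propose_steps: Callable[[str, List[str]], List[str]],
    score: Callable[[str, List[str]], float],
    is_complete: Callable[[List[str]], bool],
    threshold: float = 0.5,
    max_steps: int = 8,
) -> List[str]:
    """Toy adaptive reasoning loop (illustrative assumption, not SEER itself).

    direct_generate stands in for direct code generation, propose_steps for a
    policy model that suggests candidate next steps, and score for a value
    model that rates a partial reasoning path.
    """
    # Try direct generation first; keep it if the value estimate is high enough.
    direct = direct_generate(problem)
    if score(problem, [direct]) >= threshold:
        return [direct]

    # Otherwise reason step by step, greedily keeping the best-scored candidate.
    path: List[str] = []
    for _ in range(max_steps):
        candidates = propose_steps(problem, path)
        if not candidates:
            break
        best = max(candidates, key=lambda step: score(problem, path + [step]))
        path.append(best)
        if is_complete(path):
            break
    return path
```

In this sketch a problem whose direct answer scores above `threshold` skips step-by-step reasoning entirely, which is one simple way to avoid "overthinking" on easy problems; a real system could replace the greedy choice with a broader search over reasoning paths.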
Wed 15 Apr
Displayed time zone: Brasilia, Distrito Federal, Brazil
14:00 - 15:30 | AI for Software Engineering 6 | Research Track at Europa II | Chair(s): Miryung Kim (UCLA and Amazon Web Services)
14:00 | 15m Talk | Cobblestone: A Divide-and-Conquer Approach for Automating Formal Verification | Research Track | Saketh Ram Kasibatla (UC San Diego), Arpan Agrawal (University of Illinois Urbana-Champaign), Yuriy Brun (University of Massachusetts), Sorin Lerner (University of California at San Diego), Talia Lily Ringer (University of Illinois Urbana-Champaign), Emily First (Rutgers University)
14:15 | 15m Talk | RISE: Rule-Driven SQL Dialect Translation via Query Reduction | Research Track | Xudong Xie (Institute of Software Chinese Academy of Sciences, China), Yuwei Zhang (Institute of Software Chinese Academy of Sciences), Wensheng Dou (Institute of Software Chinese Academy of Sciences), Yu Gao (Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences), Ziyu Cui (Institute of Software at Chinese Academy of Sciences), Jiansen Song (Institute of Software at Chinese Academy of Sciences), Rui Yang (Institute of Software, Chinese Academy of Sciences), Jun Wei (Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences)
14:30 | 15m Talk | RepoScope: Leveraging Call Chain-Aware Multi-View Context for Repository-Level Code Generation | Research Track | Yang Liu, Li Zhang (Beihang University), Fang Liu (Beihang University), Zhuohang Wang (Beihang University), Donglin Wei (Beihang University), Zhishuo Yang (Beihang University), Kechi Zhang (Peking University, China), Jia Li, Lin Shi (Beihang University)
14:45 | 15m Talk | What to Retrieve for Effective Retrieval-Augmented Code Generation? An Empirical Study and Beyond | Research Track | Wenchao Gu (Technical University of Munich), Juntao Chen (Sun Yat-sen University), Yanlin Wang (Sun Yat-sen University), Tianyue Jiang (Sun Yat-sen University), Xingzhe Li (Sun Yat-sen University), Mingwei Liu (Sun Yat-sen University), Xilin Liu (Huawei Cloud), Yuchi Ma (Huawei Cloud Computing Technologies), Zibin Zheng (Sun Yat-sen University)
15:00 | 15m Talk | SEER: Enhancing Chain-of-Thought Code Generation through Self-Exploring Deep Reasoning | Research Track | Shuzheng Gao (Chinese University of Hong Kong), Chaozheng Wang (The Chinese University of Hong Kong), Cuiyun Gao (Harbin Institute of Technology, Shenzhen), Michael Lyu (The Chinese University of Hong Kong)
15:15 | 15m Talk | SmartC2Rust: Iterative, Feedback-Driven C-to-Rust Translation via Large Language Models for Safety and Equivalence | Research Track | Momoko Shiraishi (The University of Tokyo), Yinzhi Cao (Johns Hopkins University), Takahiro Shinagawa (The University of Tokyo)