FSE 2025
Mon 23 - Fri 27 June 2025 Trondheim, Norway
co-located with ISSTA 2025
Tue 24 Jun 2025 14:40 - 15:00 at Cosmos Hall - LLM for SE 2 Chair(s): Jialun Cao

Large code models (LCMs) have remarkably advanced the field of code generation. Despite their impressive capabilities, they still face practical deployment issues, such as high inference costs, the limited accessibility of proprietary LCMs, and the limited adaptability of ultra-large LCMs. These issues highlight the critical need for more accessible, lightweight yet effective LCMs. Knowledge distillation (KD) offers a promising solution: it transfers the programming capabilities of larger, more advanced LCMs (teachers) to smaller, less powerful LCMs (students). However, existing KD methods often overlook fault knowledge and rely on static seed knowledge, which limits their effectiveness.

In this paper, we propose SODA, a novel Self-Paced knOwledge DistillAtion framework that develops lightweight yet effective student LCMs by continually transferring programming capabilities from advanced teacher LCMs. SODA consists of three stages in one cycle: (1) the Correct-and-Fault Knowledge Delivery stage improves the student model's ability to recognize errors while preserving its basic programming skills during knowledge transfer, combining correctness-aware supervised learning with fault-aware contrastive learning; (2) the Multi-view Feedback stage measures the quality of the student model's outputs from two views, model-based and static tool-based measurement; and (3) the Feedback-based Knowledge Update stage adaptively updates the student model by generating new questions at different difficulty levels, where the difficulty levels are determined by the feedback from the previous stage. By iterating this training cycle, the student model is continuously refined and learns increasingly advanced programming skills from the teacher model. We compare SODA with four state-of-the-art KD approaches on the code generation task across seven programming languages. Experimental results show that SODA improves the student model by 65.96% in terms of average Pass@1, outperforming the best baseline PERsD by 29.85%. Based on the proposed SODA framework, we develop SodaCoder, a series of lightweight yet effective LCMs with fewer than 7B parameters, which outperform 15 LCMs with at most 16B parameters. Notably, SodaCoder-DS 6.7B, built on DeepseekCoder-6.7B, even surpasses the prominent ChatGPT in average Pass@1 across the seven programming languages (66.4 vs. 61.3).
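To make the three-stage cycle concrete, the Python sketch below shows one plausible way such a self-paced distillation loop could be organized. It is purely illustrative: the student/teacher interfaces (solve, judge, inject_fault, new_question, supervised_step, contrastive_step), the run_static_tools helper, and the 0.5 score thresholds are assumptions for readability, not the authors' implementation.

# Hypothetical sketch of a SODA-style three-stage self-paced distillation cycle.
# All interface names and thresholds are illustrative placeholders.

from dataclasses import dataclass
from typing import List


@dataclass
class Problem:
    prompt: str
    difficulty: str = "medium"   # "easy" | "medium" | "hard"


@dataclass
class Feedback:
    model_score: float    # model-based measurement (e.g., a judgement by the teacher)
    static_score: float   # static tool-based measurement (e.g., parse/lint checks)

    @property
    def passed(self) -> bool:
        # Treat a generation as acceptable only when both views agree (assumed threshold).
        return self.model_score >= 0.5 and self.static_score >= 0.5


def run_static_tools(code: str) -> float:
    """Rough static check: 1.0 if the generated code parses as Python, else 0.0."""
    try:
        compile(code, "<generated>", "exec")
        return 1.0
    except SyntaxError:
        return 0.0


def correct_and_fault_delivery(student, teacher, problems: List[Problem]) -> None:
    """Stage 1: correctness-aware supervised learning plus fault-aware contrastive learning."""
    for p in problems:
        correct_solution = teacher.solve(p.prompt)                  # positive example
        faulty_solution = teacher.inject_fault(correct_solution)    # negative (faulty) example
        student.supervised_step(p.prompt, correct_solution)
        student.contrastive_step(p.prompt, correct_solution, faulty_solution)


def multi_view_feedback(student, teacher, problems: List[Problem]) -> List[Feedback]:
    """Stage 2: score the student's generations from a model-based and a static tool-based view."""
    feedback = []
    for p in problems:
        generated = student.solve(p.prompt)
        feedback.append(Feedback(
            model_score=teacher.judge(p.prompt, generated),
            static_score=run_static_tools(generated),
        ))
    return feedback


def feedback_based_update(teacher, problems: List[Problem],
                          feedback: List[Feedback]) -> List[Problem]:
    """Stage 3: generate new questions whose difficulty depends on the stage-2 feedback."""
    next_round = []
    for p, fb in zip(problems, feedback):
        level = "hard" if fb.passed else "easy"
        next_round.append(Problem(teacher.new_question(p.prompt, level), level))
    return next_round


def soda_training_loop(student, teacher, seed_problems: List[Problem], cycles: int = 3):
    """Iterate the three stages so the student is refined from cycle to cycle."""
    problems = list(seed_problems)
    for _ in range(cycles):
        correct_and_fault_delivery(student, teacher, problems)
        feedback = multi_view_feedback(student, teacher, problems)
        problems = feedback_based_update(teacher, problems, feedback)
    return student

The key design point the sketch tries to capture is that the question pool is not static: each cycle regenerates it from the previous cycle's feedback, which is what makes the curriculum self-paced.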

Tue 24 Jun

Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

14:00 - 15:30
LLM for SE 2 (Research Papers / Industry Papers / Ideas, Visions and Reflections) at Cosmos Hall
Chair(s): Jialun Cao Hong Kong University of Science and Technology
14:00
20m
Talk
Migrating Code At Scale With LLMs At Google
Industry Papers
Celal Ziftci Google, Stoyan Nikolov Google, Inc., Anna Sjovall Google, Inc., Bo Kim Google, Daniele Codecasa Google, Inc., Max Kim Google
14:20
20m
Talk
Integrating Large Language Models and Reinforcement Learning for Non-Linear Reasoning
Research Papers
Yoav Alon University of Bristol, Cristina David University of Bristol
14:40
20m
Talk
Smaller but Better: Self-Paced Knowledge Distillation for Lightweight yet Effective LCMs
Research Papers
Yujia Chen Harbin Institute of Technology, Shenzhen, Yang Ye Huawei Cloud Computing Technologies Co., Ltd., Zhongqi Li Huawei Cloud Computing Technologies Co., Ltd., Yuchi Ma Huawei Cloud Computing Technologies, Cuiyun Gao Harbin Institute of Technology, Shenzhen
15:00
10m
Talk
Enabling Scalable Proactive Workspaces With Environment-Wide Context
Ideas, Visions and Reflections
Nick Bradley University of British Columbia, Thomas Fritz University of Zurich, Reid Holmes University of British Columbia
15:10
20m
Talk
Bridging Operator Semantic Inconsistencies: A Source-level Cross-framework Model Conversion Approach
Research Papers
Xingpei Li National University of Defense Technology, China, Yan Lei Chongqing University, Zhouyang Jia National University of Defense Technology, Yuanliang Zhang National University of Defense Technology, Haoran Liu National University of Defense Technology, Liqian Chen National University of Defense Technology, Wei Dong National University of Defense Technology, Shanshan Li National University of Defense Technology

Information for Participants
Tue 24 Jun 2025 14:00 - 15:30 at Cosmos Hall - LLM for SE 2 Chair(s): Jialun Cao
Info for room Cosmos Hall:

This is the main event hall of the Clarion Hotel, which will be used to host keynote talks and other plenary sessions. The FSE and ISSTA banquets will also take place in this room.

The room is just in front of the registration desk, on the other side of the main conference area. The large doors with numbers “1” and “2” provide access to the Cosmos Hall.
