Beyond PEFT: Layer-Wise Optimization for More Effective and Efficient Large Code Model Tuning
Large Code Models (LCMs) have demonstrated remarkable effectiveness across various code intelligence tasks. Supervised fine-tuning is essential to optimize their performance for specific downstream tasks. Compared with traditional full-parameter fine-tuning (FFT), Parameter-Efficient Fine-Tuning (PEFT) methods can train LCMs with substantially reduced resource consumption and have gained widespread attention among researchers and practitioners. While existing studies have explored PEFT methods for code intelligence tasks, they have predominantly focused on a limited subset of scenarios, such as code generation with publicly available datasets, which constrains the generalizability of their findings. To mitigate this limitation, we conduct a comprehensive study of the effectiveness of PEFT methods across five code intelligence tasks involving both public and private data. Our extensive experiments reveal a considerable performance gap between PEFT methods and FFT, contrary to the findings of existing studies. We also find that this disparity is particularly pronounced in tasks involving private data.
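For readers unfamiliar with PEFT, the sketch below illustrates the kind of LoRA-style setup such studies typically compare against full-parameter fine-tuning. It assumes the Hugging Face transformers and peft libraries; the model name, target modules, and hyperparameters are illustrative assumptions, not the configuration used in the paper.

# A minimal, illustrative PEFT (LoRA) setup; all names and values here are
# assumptions for illustration, not the paper's experimental configuration.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("codellama/CodeLlama-7b-hf")

# LoRA freezes the base weights and trains small low-rank adapter matrices
# injected into selected modules (here, the attention query/value projections).
config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # module names depend on the architecture
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only a small fraction of weights is trainable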
To improve the tuning performance of LCMs while reducing resource utilization during training, we propose a Layer-Wise Optimization (LWO) strategy in this paper. LWO incrementally updates the parameters of each layer of the whole model architecture, without introducing any additional component or inference overhead. Experiments across five LCMs and five code intelligence tasks demonstrate that LWO trains LCMs more effectively and efficiently than previous PEFT methods, with significant improvements on tasks using private data. For instance, in the line-level code completion task using our private code repositories, LWO outperforms the state-of-the-art LoRA method by 22% and 12% in terms of accuracy and BLEU scores, respectively. Furthermore, LWO enables more efficient LCM tuning, reducing training time by an average of 42.7% compared to LoRA.
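The abstract does not spell out LWO's exact update schedule, but a minimal sketch of the layer-wise idea, assuming a Hugging Face-style decoder that exposes model.model.layers and a simple one-layer-per-step cycling schedule, could look like the following. This is an illustration of incremental per-layer updates, not the authors' LWO implementation.

import torch

def train_layer_wise(model, dataloader, compute_loss, epochs=1, lr=2e-5):
    # Hypothetical layer grouping: one parameter group per transformer block,
    # assuming a decoder that exposes its blocks as model.model.layers.
    groups = [list(layer.parameters()) for layer in model.model.layers]
    optimizers = [torch.optim.AdamW(g, lr=lr) for g in groups]

    for p in model.parameters():      # start with every weight frozen
        p.requires_grad_(False)

    step = 0
    model.train()
    for _ in range(epochs):
        for batch in dataloader:
            idx = step % len(groups)  # cycle through layers, one layer per step
            for p in groups[idx]:
                p.requires_grad_(True)
            loss = compute_loss(model, batch)
            loss.backward()           # only the active layer's parameters accumulate gradients
            optimizers[idx].step()
            optimizers[idx].zero_grad()
            for p in groups[idx]:     # re-freeze before moving to the next step
                p.requires_grad_(False)
            step += 1

Because no adapter modules are added in such a scheme, the tuned model keeps the same architecture and inference cost as the original, consistent with the abstract's claim of no additional components or inference overhead.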
Wed 25 Jun (displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna)
11:00 - 12:30 | SE and AI 2 | Ideas, Visions and Reflections / Research Papers | Cosmos Hall | Chair(s): Massimiliano Di Penta (University of Sannio, Italy)
11:00 (20m) Talk | Beyond PEFT: Layer-Wise Optimization for More Effective and Efficient Large Code Model Tuning | Research Papers | Chaozheng Wang (The Chinese University of Hong Kong), Jia Feng (University of Electronic Science and Technology of China), Shuzheng Gao (Chinese University of Hong Kong), Cuiyun Gao (Harbin Institute of Technology, Shenzhen), Zongjie Li (Hong Kong University of Science and Technology), Ting Peng (Tencent Inc.), Hailiang Huang (Tencent Inc.), Yuetang Deng (Tencent), Michael Lyu (Chinese University of Hong Kong)
11:20 (20m) Talk | Automated Trustworthiness Oracle Generation for Machine Learning Text Classifiers | Research Papers | Lam Nguyen Tung (Monash University, Australia), Steven Cho (The University of Auckland, New Zealand), Xiaoning Du (Monash University), Neelofar Neelofar (Royal Melbourne Institute of Technology (RMIT)), Valerio Terragni (University of Auckland), Stefano Ruberto (JRC European Commission), Aldeida Aleti (Monash University)
11:40 (20m) Talk | A Causal Learning Framework for Enhancing Robustness of Source Code Models | Research Papers | Junyao Ye (Huazhong University of Science and Technology), Zhen Li (Huazhong University of Science and Technology), Xi Tang (Huazhong University of Science and Technology), Deqing Zou (Huazhong University of Science and Technology), Shouhuai Xu (University of Colorado Colorado Springs), Weizhong Qiang (Huazhong University of Science and Technology), Hai Jin (Huazhong University of Science and Technology)
12:00 (20m) Talk | Eliminating Backdoors in Neural Code Models for Secure Code Understanding | Research Papers | Weisong Sun (Nanjing University), Yuchen Chen (Nanjing University), Chunrong Fang (Nanjing University), Yebo Feng (Nanyang Technological University), Yuan Xiao (Nanjing University), An Guo (Nanjing University), Quanjun Zhang (School of Computer Science and Engineering, Nanjing University of Science and Technology), Zhenyu Chen (Nanjing University), Baowen Xu (Nanjing University), Yang Liu (Nanyang Technological University)
12:20 (10m) Talk | Reduction Fusion for Optimized Distributed Data-Parallel Computations via Inverse Recomputation | Ideas, Visions and Reflections | Haoxiang Lin (Microsoft Research), Yang Wang (Microsoft Research Asia), Yanjie Gao (Microsoft Research), Hongyu Zhang (Chongqing University), Ming Wu (Zero Gravity Labs), Mao Yang (Microsoft Research)
This is the main event hall of Clarion Hotel, which will be used to host keynote talks and other plenary sessions. The FSE and ISSTA banquets will also happen in this room.
The room is just in front of the registration desk, on the other side of the main conference area. The large doors with numbers “1” and “2” provide access to the Cosmos Hall.