SANER 2025
Tue 4 - Fri 7 March 2025 Montréal, Québec, Canada
Thu 6 Mar 2025 11:00 - 11:15 at M-1410 - Program Analysis Chair(s): Rrezarta Krasniqi

Automated Program Repair (APR) aims to enhance software reliability by automatically generating bug-fixing patches. Recent work has advanced the state of the art in APR by fine-tuning pre-trained large language models (LLMs), such as CodeT5. However, fine-tuning becomes less effective when training data is scarce, which is a common situation in practice. To alleviate this limitation, this paper adapts prompt tuning for APR and conducts a comprehensive study to evaluate its effectiveness in data scarcity scenarios, using three LLMs of different sizes and six diverse datasets across four programming languages. Prompt tuning rewrites the input of a model by adding extra prompt tokens and tunes both the model and the prompts on a small dataset. These tokens can provide task-specific knowledge that improves the model for APR, which is especially critical in data scarcity scenarios. Moreover, domain knowledge has proven crucial in many code intelligence tasks, but existing studies fail to leverage domain knowledge during prompt tuning for APR. To close this gap, we introduce knowledge prompt tuning, an approach that adapts prompt tuning with six distinct types of code- or bug-related domain knowledge for APR. Our work, to the best of our knowledge, is the first to adapt and evaluate prompt tuning and the effectiveness of code- or bug-related domain knowledge for APR, particularly under data scarcity settings. Our evaluation results show that prompt tuning with knowledge generally outperforms fine-tuning under various experimental settings, achieving an average improvement of 87.33% over fine-tuning in data scarcity scenarios.
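As a rough illustration of the input rewriting described in the abstract, the sketch below wraps a buggy snippet with extra prompt tokens and domain-knowledge fields. It is not the authors' implementation: the token names (`<p0>` and so on), the function `build_prompted_input`, and the knowledge fields `language` and `bug type` are all hypothetical, and the abstract does not list the six knowledge types actually used. In real prompt tuning the prompt tokens would typically map to trainable embeddings rather than literal strings.

```python
from typing import Dict, List

# Hypothetical placeholders for trainable prompt tokens; in an actual setup
# each would correspond to a learned embedding, not a literal string.
SOFT_PROMPT_TOKENS: List[str] = [f"<p{i}>" for i in range(4)]


def build_prompted_input(buggy_code: str, knowledge: Dict[str, str]) -> str:
    """Wrap buggy code with prompt tokens and optional domain-knowledge fields."""
    parts = list(SOFT_PROMPT_TOKENS)
    # The knowledge fields are illustrative assumptions; they are sorted by
    # name so that the resulting input is deterministic.
    for name, value in sorted(knowledge.items()):
        parts.append(f"{name}: {value}")
    parts.append(f"buggy code: {buggy_code}")
    parts.append("fixed code:")
    return " ".join(parts)


example = build_prompted_input(
    "if (i <= n) return a[i];",
    {"language": "Java", "bug type": "off-by-one"},
)
print(example)
```

The resulting string would then be tokenized and passed to the model in place of the raw buggy code, so that the same template is applied during tuning and at inference time.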

Thu 6 Mar

Displayed time zone: Eastern Time (US & Canada)

11:00 - 12:30
Program Analysis - Research Papers at M-1410
Chair(s): Rrezarta Krasniqi University of North Carolina at Charlotte
11:00
15m
Talk
Adapting Knowledge Prompt Tuning for Enhanced Automated Program Repair
Research Papers
Xuemeng Cai Singapore Management University, Lingxiao Jiang Singapore Management University
11:15
15m
Talk
A Metric for Measuring the Impact of Rare Paths on Program Coverage
Research Papers
Leo St. Amour Virginia Tech, Eli Tilevich Virginia Tech, Muhammad Ali Gulzar Virginia Tech
11:30
15m
Talk
A Progressive Transformer for Unifying Binary Code Embedding and Knowledge Transfer
Research Papers
Hanxiao Lu Columbia University, Hongyu Cai Purdue University, Yiming Liang Purdue University, Antonio Bianchi Purdue University, Z. Berkay Celik Purdue University
11:45
15m
Talk
Is This You, LLM? Recognizing AI-written Programs with Multilingual Code Stylometry
Research Papers
Andrea Gurioli DISI - University of Bologna, Maurizio Gabbrielli DISI - University of Bologna, Stefano Zacchiroli Télécom Paris, Polytechnic Institute of Paris
Pre-print
12:00
15m
Talk
SpeedGen: Enhancing Code Efficiency through Large Language Model-Based Performance Optimization
Research Papers
Nils Purschke Technical University of Munich, Sven Kirchner Technical University of Munich, Alois Knoll Technical University of Munich
12:15
15m
Talk
StriCT-BJ: A String Constraint Benchmark from Real Java Programs
Research Papers
Chi Zhang Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Jian Zhang Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences