
This program is tentative and subject to change.

Fri 2 May 2025 14:30 - 14:45 at 212 - AI for Analysis 5

Language models have improved by orders of magnitude with the recent emergence of Transformer-based Large Language Models (LLMs). LLMs have demonstrated their ability to generate “natural” code that is highly similar to code written by professional developers. One intermediate value an LLM can emit is entropy, which measures the naturalness of a token of code. We hypothesize that entropy can improve the performance of Automated Program Repair (APR) tasks. While much progress has been made in APR, fault localization techniques suffer from a lack of diversity in ranking scores, patch generation tools tend to be inefficient because all tests must run before a patch can be judged likely correct, and patch ranking often suffers from the test-suite overfitting problem. However, using an LLM directly for APR raises concerns about training-data leakage. In this work, we introduce a novel way of combining the entropy of LLMs with prior APR tools to improve all stages of APR. By using only the prefix and suffix context of a line or block of code to measure naturalness, we can use LLMs to localize faults and rank patches while eliminating the dependency on test suites. We show that entropy is highly complementary to prior fault localization tools: our proposed method achieves a 108% top-1 score improvement over SBFL. When using entropy for patch ranking and classification, our method ranks correct patches more effectively than state-of-the-art machine learning tools, with a 49% improvement in top-1. Our work suggests that LLMs can effectively complement prior APR tasks while minimizing both the test-suite overfitting problem and the LLM data-leakage problem.
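The core idea in the abstract — scoring each line of code by how "surprised" an LLM is by its tokens, then ranking the most unnatural lines as the most suspicious — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the per-token probabilities below are hand-picked stand-ins for what a real LLM conditioned on prefix/suffix context would assign, and the function names are hypothetical.

```python
import math

def line_entropy(token_probs):
    """Mean surprisal (negative log2-probability) over a line's tokens.

    token_probs: probabilities a (hypothetical) LLM assigns to each token
    of the line, conditioned on its surrounding prefix/suffix context.
    Higher values mean the line looks less natural to the model.
    """
    return sum(-math.log2(p) for p in token_probs) / len(token_probs)

def rank_suspicious_lines(lines_with_probs):
    """Rank lines most-unnatural-first, as a fault-localization signal."""
    scored = [(line, line_entropy(probs)) for line, probs in lines_with_probs]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Illustrative token probabilities (NOT from a real model):
program = [
    ("int sum = 0;",            [0.90, 0.80, 0.85, 0.90]),
    ("for (i = 0; i < n; i++)", [0.70, 0.80, 0.90, 0.75]),
    ("sum -= a[i];",            [0.40, 0.05, 0.30, 0.20]),  # buggy '-=' is unnatural
]
ranking = rank_suspicious_lines(program)
print(ranking[0][0])  # → sum -= a[i];  (the buggy line surfaces at top-1)
```

Note that no test suite is consulted: the ranking relies only on the model's probability estimates, which is what lets this signal sidestep test-suite overfitting.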


Fri 2 May

Displayed time zone: Eastern Time (US & Canada)

14:00 - 15:30
14:00
15m
Talk
3DGen: AI-Assisted Generation of Provably Correct Binary Format Parsers
Research Track
Sarah Fakhoury Microsoft Research, Markus Kuppe Microsoft Research, Shuvendu K. Lahiri Microsoft Research, Tahina Ramananandro Microsoft Research, Nikhil Swamy Microsoft Research
14:15
15m
Talk
Aligning the Objective of LLM-based Program Repair
Research Track
Junjielong Xu The Chinese University of Hong Kong, Shenzhen, Ying Fu Chongqing University, Shin Hwei Tan Concordia University, Pinjia He Chinese University of Hong Kong, Shenzhen
Pre-print
14:30
15m
Talk
Revisiting Unnaturalness for Automated Program Repair in the Era of Large Language Models
Research Track
Aidan Z.H. Yang Carnegie Mellon University, Sophia Kolak Carnegie Mellon University, Vincent J. Hellendoorn Carnegie Mellon University, Ruben Martins Carnegie Mellon University, Claire Le Goues Carnegie Mellon University
14:45
15m
Talk
The Fact Selection Problem in LLM-Based Program Repair
Research Track
Nikhil Parasaram Uber Amsterdam, Huijie Yan University College London, Boyu Yang University College London, Zineb Flahy University College London, Abriele Qudsi University College London, Damian Ziaber University College London, Earl T. Barr University College London, Sergey Mechtaev Peking University
15:00
15m
Talk
Towards Understanding the Characteristics of Code Generation Errors Made by Large Language Models
Research Track
Zhijie Wang University of Alberta, Zijie Zhou University of Illinois Urbana-Champaign, Da Song University of Alberta, Yuheng Huang University of Alberta, Canada, Shengmai Chen Purdue University, Lei Ma The University of Tokyo & University of Alberta, Tianyi Zhang Purdue University
Pre-print
15:15
15m
Talk
Beyond Syntax: How Do LLMs Understand Code?
New Ideas and Emerging Results (NIER)
Marc North Durham University, Amir Atapour-Abarghouei Durham University, Nelly Bencomo Durham University