ASE 2024
Sun 27 October - Fri 1 November 2024 Sacramento, California, United States
Thu 31 Oct 2024 13:45 - 14:00 at Magnolia - Code and issue report Chair(s): Baishakhi Ray

Recent studies have shown that traditional techniques for understanding code changes are less effective than techniques that directly prompt large language models (LLMs). However, current LLM-based techniques rely heavily on commercial, large-scale models such as GPT-3.5 and GPT-4, which prevents their widespread practical deployment. This paper investigates the feasibility of deploying small-scale LLMs while maintaining performance comparable or superior to that of commercial and larger-scale LLMs in code change understanding. To this end, we developed HQCM, a small yet high-quality dataset that was reviewed, revised, and validated by five human experts. We fine-tuned small-scale (7B and 220M) LLMs on HQCM. Our evaluation confirms the significant benefits brought by HQCM and indicates that small-scale LLMs fine-tuned on HQCM can outperform state-of-the-art baselines and larger-scale (>=70B) LLMs on change understanding tasks, namely change summarization, change classification, and code refinement. This study supports the use of small-scale LLMs in industry or resource-constrained settings such as embedded systems, which distinguishes our work from others.

Thu 31 Oct

Displayed time zone: Pacific Time (US & Canada)

13:30 - 15:00
Code and issue report - Research Papers at Magnolia
Chair(s): Baishakhi Ray Columbia University, New York; AWS AI Lab
13:30
15m
Talk
PatUntrack: Automated Generating Patch Examples for Issue Reports without Tracked Insecure Code
Research Papers
Ziyou Jiang Institute of Software at Chinese Academy of Sciences, Lin Shi Beihang University, Guowei Yang University of Queensland, Qing Wang Institute of Software at Chinese Academy of Sciences
13:45
15m
Talk
Understanding Code Changes Practically with Small-Scale Language Models
Research Papers
Cong Li Zhejiang University; Ant Group, Zhaogui Xu Ant Group, Peng Di Ant Group, Dongxia Wang Zhejiang University, Zheng Li Ant Group, Qian Zheng Ant Group
14:00
15m
Talk
DRMiner: Extracting Latent Design Rationale from Jira Issue Logs (ACM SIGSOFT Distinguished Paper Award)
Research Papers
Jiuang Zhao Beihang University, Zitian Yang Beihang University, Li Zhang Beihang University, Xiaoli Lian Beihang University, China, Donghao Yang Beihang University, Xin Tan Beihang University
14:15
15m
Talk
An Empirical Study on Learning-based Techniques for Explicit and Implicit Commit Messages Generation
Research Papers
Zhiquan Huang Sun Yat-sen University, Yuan Huang Sun Yat-sen University, Xiangping Chen Sun Yat-sen University, Xiaocong Zhou School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China, Changlin Yang Sun Yat-sen University, Zibin Zheng Sun Yat-sen University
14:30
15m
Talk
RCFG2Vec: Considering Long-Distance Dependency for Binary Code Similarity Detection
Research Papers
Weilong Li School of Computer Science and Engineering, Sun Yat-sen University, Jintian Lu College of Computer Science and Engineering, Jishou University, Ruizhi Xiao School of Computer Science and Engineering, Sun Yat-sen University, Pengfei Shao China Southern Power Grid Digital Grid Group Information and Telecommunication Technology Co., Ltd., Shuyuan Jin School of Computer Science and Engineering, Sun Yat-sen University
14:45
15m
Talk
ChatBR: Automated assessment and improvement of bug report quality using ChatGPT
Research Papers
Lili Bo Yangzhou University, Wangjie Ji Yangzhou University, Xiaobing Sun Yangzhou University, Ting Zhang Singapore Management University, Xiaoxue Wu Yangzhou University, Ying Wei Yangzhou University