TCSE logo 
 Sigsoft logo
Sustainability badge

This program is tentative and subject to change.

Wed 30 Apr 2025 17:00 - 17:15 at 204 - Program Comprehension 2

Commit messages are critical for code comprehension and software maintenance. Writing a high-quality message requires skill and effort. To support developers and reduce their effort on this task, several approaches have been proposed to automatically generate commit messages. Despite the promising performance reported, we have identified three significant and prevalent threats in these automated approaches: 1) the datasets used to train and evaluate these approaches contain a considerable amount of �noise’; 2) current approaches only consider commits of a limited diff size; and 3) current approaches can only generate the subject of a commit message, not the message body. The first limitation may let the models �learn’ inappropriate messages in the training stage, and also lead to inflated performance results in their evaluation. The other two threats can considerably weaken the practical usability of these approaches. Further, with the rapid emergence of large language models (LLMs) that show superior performance in many software engineering tasks, it is worth asking: can LLMs address the challenge of long diffs and whole message generation? This article first reports the results of an empirical study to assess the impact of these three threats on the performance of the state-of-the-art auto generators of commit messages. We collected commit data of the Top 1,000 most-starred Java projects in GitHub and systematically removed noisy commits with bot-submitted and meaningless messages. We then compared the performance of four approaches representative of the state-of-the-art before and after the removal of noisy messages, or with different lengths of commit diffs. We also conducted a qualitative survey with developers to investigate their perspectives on simply generating message subjects. Finally, we evaluate the performance of two representative LLMs, namely UniXcoder and ChatGPT, in generating more practical commit messages. The results demonstrate that generating commit messages is of great practical value, considerable work is needed to mature the current state-of-the-art, and LLMs can be an avenue worth trying to address the current limitations. Our analyses provide insights for future work to achieve better performance in practice.

This program is tentative and subject to change.

Wed 30 Apr

Displayed time zone: Eastern Time (US & Canada) change

16:00 - 17:30
Program Comprehension 2Journal-first Papers / Research Track at 204
16:00
15m
Talk
Enhancing Fault Localization in Industrial Software Systems via Contrastive Learning
Research Track
Chun Li Nanjing University, Hui Li Samsung Electronics (China) R&D Centre, Zhong Li , Minxue Pan Nanjing University, Xuandong Li Nanjing University
16:15
15m
Talk
On the Understandability of MLOps System Architectures
Journal-first Papers
Stephen John Warnett University of Vienna, Uwe Zdun University of Vienna
Link to publication DOI
16:30
15m
Talk
Bridging the Language Gap: An Empirical Study of Bindings for Open Source Machine Learning Libraries Across Software Package Ecosystems
Journal-first Papers
Hao Li Queen's University, Cor-Paul Bezemer University of Alberta
16:45
15m
Talk
Understanding Code Understandability Improvements in Code Reviews
Journal-first Papers
Delano Hélio Oliveira , Reydne Bruno dos Santos UFPE, Benedito Fernando Albuquerque de Oliveira Federal University of Pernambuco, Martin Monperrus KTH Royal Institute of Technology, Fernando Castor University of Twente, Fernanda Madeiral Vrije Universiteit Amsterdam
17:00
15m
Talk
Automatic Commit Message Generation: A Critical Review and Directions for Future Work
Journal-first Papers
Yuxia Zhang Beijing Institute of Technology, Zhiqing Qiu Beijing Institute of Technology, Klaas-Jan Stol Lero; University College Cork; SINTEF Digital , Wenhui Zhu Beijing Institute of Technology, Jiaxin Zhu Institute of Software at Chinese Academy of Sciences, Yingchen Tian Tmall Technology Co., Hui Liu Beijing Institute of Technology
17:15
7m
Talk
Efficient Management of Containers for Software Defined Vehicles
Journal-first Papers
Anwar Ghammam Oakland University, Rania Khalsi University of Michigan - Flint, Marouane Kessentini University of Michigan - Flint, Foyzul Hassan University of Michigan at Dearborn
:
:
:
: