
This program is tentative and subject to change.

Fri 2 May 2025 16:15 - 16:30 at 212 - AI for Process

Evaluating the quality of commit messages is a challenging task in software engineering. Existing evaluation approaches, such as automatic metrics like BLEU, ROUGE, and METEOR, as well as manual human assessment, have notable limitations: automatic metrics often overlook semantic relevance and context, while human evaluations are time-consuming and costly. To address these challenges, we explore the potential of Large Language Models (LLMs) as an alternative method for commit message evaluation. We conducted two tasks using state-of-the-art LLMs (GPT-4o, LLaMA 3.1 70B and 8B, and Mistral Large) to assess their capability to evaluate commit messages. Our findings show that LLMs can effectively identify relevant commit messages and align well with human judgment, demonstrating their potential to serve as reliable automated evaluators. This study provides a new perspective on using LLMs for commit message assessment, paving the way for scalable and consistent evaluation methodologies in software engineering.
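The abstract does not spell out the prompt or scoring protocol used in the study, so the snippet below is only a minimal sketch of the "LLM as evaluator" pattern it describes, assuming the OpenAI Python SDK and GPT-4o (one of the models the abstract names). The prompt wording, the 1-5 scale, and the rate_commit_message helper are illustrative assumptions, not the authors' setup.

```python
# Illustrative sketch only: the paper's exact prompt, scale, and pipeline are
# not given here; this shows the general "LLM as evaluator" idea it describes.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PROMPT_TEMPLATE = """You are reviewing a commit message for a code change.

Diff:
{diff}

Commit message:
{message}

Rate how well the message summarizes the change on a 1-5 scale
(5 = accurate and informative, 1 = irrelevant). Reply with the number only."""


def rate_commit_message(diff: str, message: str) -> int:
    """Ask the LLM for a 1-5 quality rating of a commit message (hypothetical helper)."""
    response = client.chat.completions.create(
        model="gpt-4o",  # one of the models the abstract reports using
        temperature=0,   # deterministic scoring, so repeated runs agree
        messages=[{"role": "user",
                   "content": PROMPT_TEMPLATE.format(diff=diff, message=message)}],
    )
    # A production evaluator would validate the reply; the sketch trusts it.
    return int(response.choices[0].message.content.strip())


if __name__ == "__main__":
    diff = "- return a + b\n+ return a + b if b >= 0 else a - b"
    print(rate_commit_message(diff, "Handle negative operands in add()"))
```

Pinning the temperature to 0 keeps repeated ratings consistent, which matters when the scores are compared against human judgments as in the study.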


Fri 2 May

Displayed time zone: Eastern Time (US & Canada)

16:00 - 17:30
16:00
15m
Talk
OptCD: Optimizing Continuous Development
Demonstrations
Talank Baral (George Mason University), Emirhan Oğul (Middle East Technical University), Shanto Rahman (The University of Texas at Austin), August Shi (The University of Texas at Austin), Wing Lam (George Mason University)
16:15
15m
Talk
LLMs as Evaluators: A Novel Approach to Commit Message Quality Assessment
New Ideas and Emerging Results (NIER)
Abhishek Kumar (Indian Institute of Technology, Kharagpur), Sandhya Sankar (Indian Institute of Technology, Kharagpur), Sonia Haiduc (Florida State University), Partha Pratim Das (Indian Institute of Technology, Kharagpur), Partha Pratim Chakrabarti (Indian Institute of Technology, Kharagpur)
16:30
15m
Talk
Towards Realistic Evaluation of Commit Message Generation by Matching Online and Offline Settings
SE In Practice (SEIP)
Petr Tsvetkov (JetBrains Research), Aleksandra Eliseeva (JetBrains Research), Danny Dig (University of Colorado Boulder; JetBrains Research), Alexander Bezzubov (JetBrains), Yaroslav Golubev (JetBrains Research), Timofey Bryksin (JetBrains Research), Yaroslav Zharov (JetBrains Research)
Pre-print
16:45
15m
Talk
Enhancing Differential Testing: LLM-Powered Automation in Release Engineering
SE In Practice (SEIP)
Ajay Krishna Vajjala (George Mason University), Arun Krishna Vajjala (George Mason University), Carmen Badea (Microsoft Research), Christian Bird (Microsoft Research), Robert DeLine (Microsoft Research), Jason Entenmann (Microsoft Research), Nicole Forsgren (Microsoft Research), Aliaksandr Hramadski (Microsoft), Sandeepan Sanyal (Microsoft), Oleg Surmachev (Microsoft), Thomas Zimmermann (University of California, Irvine), Haris Mohammad (Microsoft), Jade D'Souza (Microsoft), Mikhail Demyanyuk (Microsoft)
17:00
15m
Talk
How much does AI impact development speed? An enterprise-based randomized controlled trial
SE In Practice (SEIP)
Elise Paradis (Google, Inc.), Kate Grey (Google), Quinn Madison (Google), Daye Nam (Google), Andrew Macvean (Google, Inc.), Nan Zhang (Google), Ben Ferrari-Church (Google), Satish Chandra (Google, Inc.)
17:15
15m
Talk
Using Reinforcement Learning to Sustain the Performance of Version Control Repositories
New Ideas and Emerging Results (NIER)
Shane McIntosh (University of Waterloo), Luca Milanesio (GerritForge Inc.), Antonio Barone (GerritForge Inc.), Jacek Centkowski (GerritForge Inc.), Marcin Czech (GerritForge Inc.), Fabio Ponciroli (GerritForge Inc.)