Context Conquers Parameters: Outperforming Proprietary LLM in Commit Message Generation (ICSE 2025 - Research Track)

Sat 26 April - Sun 4 May 2025 Ottawa, Ontario, Canada

Who

Aaron Imani, Iftekhar Ahmed, Mohammad Moshirpour

Track

ICSE 2025 Research Track

Time Zone

The program is currently displayed in (GMT-04:00) Eastern Time (US & Canada).

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Thu 1 May 2025, 14:15 - 14:30, at 213 - AI for Program Comprehension 2. Chair(s): Oscar Chaparro

Abstract

Commit messages provide descriptions of the modifications made in a commit using natural language, making them crucial for software maintenance and evolution. Recent developments in Large Language Models (LLMs) have led to their use in generating high-quality commit messages, such as the Omniscient Message Generator (OMG). This method employs GPT-4 to produce state-of-the-art commit messages. However, the use of proprietary LLMs like GPT-4 in coding tasks raises privacy and sustainability concerns, which may hinder their industrial adoption. Considering that open-source LLMs have achieved competitive performance in developer tasks such as compiler validation, this study investigates whether they can be used to generate commit messages that are comparable to those generated by OMG. Our experiments show that an open-source LLM can generate commit messages that are comparable to those produced by OMG. In addition, through a series of contextual refinements, we propose lOcal MessagE GenerAtor (OMEGA), a commit message generation (CMG) approach that uses a 4-bit quantized 8B open-source LLM. OMEGA produces state-of-the-art commit messages, surpassing the performance of GPT-4 in practitioners’ preference.

Aaron Imani

University of California, Irvine

Iftekhar Ahmed

University of California, Irvine

United States

Mohammad Moshirpour

University of California, Irvine

Session Program

Thu 1 May
Displayed time zone: Eastern Time (US & Canada)

	14:00 - 15:30	AI for Program Comprehension 2 (Research Track) at 213. Chair(s): Oscar Chaparro (William & Mary)

	14:00	15m	Talk	Code Comment Inconsistency Detection and Rectification Using a Large Language Model	Research Track	Guoping Rong (Nanjing University), Yongda Yu (Nanjing University), Song Liu (Nanjing University), Xin Tan (Nanjing University), Tianyi Zhang (Nanjing University), Haifeng Shen (Southern Cross University), Jidong Hu (Zhongxing Telecom Equipment)
	14:15	15m	Talk	Context Conquers Parameters: Outperforming Proprietary LLM in Commit Message Generation	Research Track	Aaron Imani (University of California, Irvine), Iftekhar Ahmed (University of California, Irvine), Mohammad Moshirpour (University of California, Irvine)
	14:30	15m	Talk	HedgeCode: A Multi-Task Hedging Contrastive Learning Framework for Code Search	Research Track	Gong Chen (Wuhan University), Xiaoyuan Xie (Wuhan University), Xunzhu Tang (University of Luxembourg), Qi Xin (Wuhan University), Wenjie Liu (Wuhan University)
	14:45	15m	Talk	Reasoning Runtime Behavior of a Program with LLM: How Far Are We?	Research Track	Junkai Chen (Zhejiang University), Zhiyuan Pan (Zhejiang University), Xing Hu (Zhejiang University), Zhenhao Li (York University), Ge Li (Peking University), Xin Xia (Huawei)
	15:00	15m	Talk	Source Code Summarization in the Era of Large Language Models	Research Track	Weisong Sun (Nanjing University), Yun Miao (Nanjing University), Yuekang Li (UNSW), Hongyu Zhang (Chongqing University), Chunrong Fang (Nanjing University), Yi Liu (Nanyang Technological University), Gelei Deng (Nanyang Technological University), Yang Liu (Nanyang Technological University), Zhenyu Chen (Nanjing University)
	15:15	15m	Talk	Template-Guided Program Repair in the Era of Large Language Models	Research Track	Kai Huang, Jian Zhang (Nanyang Technological University), Xiangxin Meng (Beihang University, Beijing, China), Yang Liu (Nanyang Technological University)