Leveraging LLM Enhanced Commit Messages to Improve Machine Learning Based Test Case Prioritization (PROMISE 2025)

Thu 26 Jun 2025 Trondheim, Norway

co-located with FSE 2025

Who

Yara Q Mahmoud, Akramul Azim, Ramiro Liscano, Kevin Smith, Yee-Kang Chang, Gkerta Seferi, Qasim Tauseef

Track

PROMISE 2025

Time Zone

The program is currently displayed in (GMT+02:00) Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Thu 26 Jun 2025 16:00 - 16:15 at Vega - Session 3 Chair(s): Yinxi Liu

Abstract

In the rapidly evolving landscape of software development, software testing is critical for maintaining code quality and reducing defects. Effective test case prioritization employs techniques to identify defects early and ensure software quality. New avenues of research have explored using machine learning (ML) to automate the process; most current applications leverage a machine learning model that uses numerical features to prioritize the test cases. This study investigates the enhancement of this process by incorporating text-based features derived from Git commit messages, which often include valuable information about code changes. Given that commit messages are often poorly written and inconsistent, we employ a large language model (LLM) to rewrite these messages based on code diffs, with the aim of improving the quality of their format and the information they contain. We then assess whether these refined commit messages, as an additional feature, contribute to better performance of the test case prioritization model. Our preliminary results indicate that the inclusion of LLM-enhanced commit messages leads to a noticeable improvement in prioritization effectiveness, suggesting a promising avenue for integrating natural language processing techniques in software testing workflows.
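
As a rough illustration of the pipeline the abstract describes, the sketch below shows one way a commit message and its diff could be packaged into an LLM rewrite prompt, and how the resulting text could be turned into a vector and appended to numerical features. This is not the authors' implementation: the page does not give the prompt, the features, or the model, so the function names (`build_rewrite_prompt`, `hashed_bag_of_words`, `combine_features`), the prompt wording, the hashing-based text encoding, and the example numeric features are all assumptions made for illustration. The LLM call itself is left out.

```python
from __future__ import annotations

import re
import zlib


def build_rewrite_prompt(message: str, diff: str) -> str:
    """Assemble a prompt asking an LLM to rewrite a commit message from its diff.

    The wording is hypothetical; the study's actual prompt is not given here.
    """
    return (
        "Rewrite the commit message so it clearly describes the change.\n"
        f"Original message:\n{message}\n\n"
        f"Code diff:\n{diff}\n"
    )


def hashed_bag_of_words(text: str, n_buckets: int = 16) -> list[float]:
    """Map text to a fixed-length token-count vector using the hashing trick."""
    vector = [0.0] * n_buckets
    for token in re.findall(r"[a-z0-9_]+", text.lower()):
        # crc32 is deterministic across runs, unlike the built-in hash() on str.
        bucket = zlib.crc32(token.encode("utf-8")) % n_buckets
        vector[bucket] += 1.0
    return vector


def combine_features(
    numeric: list[float], message: str, n_buckets: int = 16
) -> list[float]:
    """Append text-derived features to the existing numerical features."""
    return list(numeric) + hashed_bag_of_words(message, n_buckets)


# Hypothetical numeric features, e.g. files changed and a recent failure rate.
numeric_features = [3.0, 0.25]
# Stand-in for the message an LLM might return for the prompt above.
enhanced_message = "Fix null check in session cache lookup"
row = combine_features(numeric_features, enhanced_message)
```

The combined `row` could then be fed to whatever prioritization model is already trained on the numerical features alone, which allows a direct comparison with and without the text-derived columns.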

Yara Q Mahmoud

Ontario Tech University

Canada

Akramul Azim

Ontario Tech University

Canada

Ramiro Liscano

Ontario Tech University

Canada

Kevin Smith

International Business Machines Corporation (IBM)

United Kingdom

Yee-Kang Chang

International Business Machines Corporation (IBM)

Canada

Gkerta Seferi

International Business Machines Corporation (IBM)

United Kingdom

Qasim Tauseef

International Business Machines Corporation (IBM)

United Kingdom

Session Program

Thu 26 Jun
Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

16:00 - 18:00	Session 3 PROMISE 2025 at Vega Chair(s): Yinxi Liu Rochester Institute of Technology

16:00 15m Talk		Leveraging LLM Enhanced Commit Messages to Improve Machine Learning Based Test Case Prioritization PROMISE 2025 Yara Q Mahmoud Ontario Tech University, Akramul Azim Ontario Tech University, Ramiro Liscano Ontario Tech University, Kevin Smith International Business Machines Corporation (IBM), Yee-Kang Chang International Business Machines Corporation (IBM), Gkerta Seferi International Business Machines Corporation (IBM), Qasim Tauseef International Business Machines Corporation (IBM)
16:16 14m Talk		Designing and Optimizing Alignment Datasets for IoT Security: A Synergistic Approach with Static Analysis Insights PROMISE 2025 Ahmad Al-Zuraiqi Queen's University Belfast, Desmond Greer Queens University
16:31 14m Talk		Efficient Adaptation of Large Language Models for Smart Contract Vulnerability Detection PROMISE 2025 Fadul Sikder Department of Computer Science and Engineering, The University of Texas at Arlington, Jeff Yu Lei University of Texas at Arlington, Yuede Ji Department of Computer Science and Engineering, The University of Texas at Arlington
16:46 14m Talk		A Combined Approach to Performance Regression Testing Resource Usage Reduction PROMISE 2025 Milad Abdullah Charles University, David Georg Reichelt Lancaster University Leipzig, Leipzig, Germany, Vojtech Horky Charles University, Lubomír Bulej Charles University, Tomas Bures Charles University, Czech Republic, Petr Tuma Charles University
17:01 14m Talk		Security Bug Report Prediction Within and Across Projects: A Comparative Study of BERT and Random Forest PROMISE 2025 Farnaz Soltaniani TU Clausthal, Mohammad Ghafari TU Clausthal, Mohammed Sayagh ETS Montreal, University of Quebec
17:16 9m Talk		Towards Build Optimization Using Digital Twins PROMISE 2025 Henri Aïdasso École de technologie supérieure (ÉTS), Francis Bordeleau École de Technologie Supérieure (ETS), Ali Tizghadam TELUS
17:26 4m Day closing		Closing PROMISE 2025

Information for Participants

Thu 26 Jun 2025 16:00 - 18:00 at Vega - Session 3 Chair(s): Yinxi Liu

Info for room Vega:

Vega is close to the registration desk.

Facing the registration desk, its entrance is on the left, close to the hotel side entrance.