APSEC 2024
Tue 3 - Fri 6 December 2024 China
Thu 5 Dec 2024 14:30 - 15:00 at Room 1 (Zunhui Room) - Session (8) Chair(s): Zhou Yang

Pull Requests (PRs) are a collaborative mechanism in GitHub, allowing developers to merge their code changes into another branch of a software repository. The PR title serves as a summary of the PR and should accurately and concisely describe the specific changes made, helping reviewers and other developers review and understand them. Many existing methods automatically generate PR titles, most of which are based on pre-trained models. Although these methods are effective, pre-trained models often require extensive fine-tuning for specific tasks. Compared to pre-trained models, large language models (LLMs) possess superior semantic understanding capabilities. As foundation models, they can solve most tasks directly without fine-tuning, providing an alternative solution for PR title generation. However, the capabilities of LLMs in automatic PR title generation have not been fully explored. To fill this gap, we conducted an empirical study to understand the capabilities of LLMs in PR title generation. Initially, directly applying LLMs to generate PR titles did not yield satisfactory results. We found that using similar PRs from the dataset as auxiliary information can effectively enhance the title generation capability of LLMs. When the number of most similar PRs used as input increased from 0 to 5, the ROUGE-L F1 score of the titles generated by LLMs increased by an average of 23.48%, with improvements in other metrics as well. In further experiments, we discovered that setting a lower temperature for the LLMs yields better performance. We then selected the best parameter configuration and compared it with existing state-of-the-art methods. Our experimental results show that LLMs outperform the state-of-the-art methods in Precision, Recall, and METEOR on the PRTiger dataset.
Additionally, human evaluation results indicate that PR titles generated by LLMs receive higher scores in Correctness, Naturalness, and Comprehensibility.
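The retrieval-augmented setup the abstract describes — finding the k most similar historical PRs and prepending their description/title pairs to the prompt as few-shot examples — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the similarity measure (token-level Jaccard), the PR data, and all function names are hypothetical assumptions.

```python
# Hypothetical sketch of few-shot PR title generation with retrieved examples.
# Similarity measure, data, and names are illustrative, not from the paper.

def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two texts."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def top_k_similar(query_desc: str, corpus: list, k: int = 5) -> list:
    """Return the k PRs whose descriptions best match the query description."""
    ranked = sorted(corpus,
                    key=lambda pr: jaccard(query_desc, pr["description"]),
                    reverse=True)
    return ranked[:k]

def build_prompt(query_desc: str, examples: list) -> str:
    """Few-shot prompt: each similar PR contributes a description/title pair."""
    shots = "\n\n".join(
        f"Description: {pr['description']}\nTitle: {pr['title']}"
        for pr in examples
    )
    return f"{shots}\n\nDescription: {query_desc}\nTitle:"

# Toy corpus of historical PRs (illustrative).
corpus = [
    {"description": "fix null pointer crash in login handler",
     "title": "Fix NPE in login handler"},
    {"description": "add unit tests for payment module",
     "title": "Add payment module tests"},
    {"description": "fix crash when login token is null",
     "title": "Handle null login token"},
]
query = "fix crash caused by null login session"
examples = top_k_similar(query, corpus, k=2)
prompt = build_prompt(query, examples)
# The prompt would then be sent to an LLM, per the study's finding,
# with a low sampling temperature to favor deterministic output.
```

In this sketch the two login-related PRs are retrieved as the nearest neighbors, so the unrelated payment PR never enters the prompt; the study's reported gains come from exactly this kind of in-context evidence rather than from fine-tuning.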

Thu 5 Dec

Displayed time zone: Beijing, Chongqing, Hong Kong, Urumqi

14:00 - 15:30
Session (8) Technical Track at Room 1 (Zunhui Room)
Chair(s): Zhou Yang Singapore Management University
14:00
30m
Talk
DupLLM: Duplicate Pull Requests Detection Based on Large Language Model
Technical Track
Zhifang Liao Central South University, Pei Liu Monash University, Peng Lan School of Computer Science and Engineering, Central South University, Changsha, China, Ke Sun Central South University
14:30
30m
Talk
Exploring the Potential of Large Language Models in Automatic Pull Request Title Generation: An Empirical Study
Technical Track
YiTao Zuo School of Computer Science and Engineering, Central South University, Changsha, China, Peng Lan School of Computer Science and Engineering, Central South University, Changsha, China, Zhifang Liao Central South University
15:00
30m
Talk
ModelCS: A Two-Stage Framework for Model Search
Technical Track
Lingjun Zhao National University of Defense Technology, Zhouyang Jia National University of Defense Technology, Jiaying Li National University of Defense Technology, Haoran Liu National University of Defense Technology, Linxiao Bai National University of Defense Technology, Shanshan Li National University of Defense Technology