DupLLM: Duplicate Pull Requests Detection Based on Large Language Model
As projects grow, the concurrent development model adopted by the open source community makes duplicate pull requests (PRs) an increasingly prominent problem. The many rejections caused by duplicate PRs increase the review workload of project maintainers and reduce the efficiency of PR review, so automated duplicate PR detection is needed. In this study, we propose DupLLM, a framework designed to detect duplicate PRs. The framework feeds the content of each PR into a large language model (LLM) to generate a refined summary. Each summary is then vectorized, converting the text into a numerical representation, and the similarity between two PRs is evaluated by computing a similarity score between their summary vectors. DupLLM outperformed the best existing model, achieving a P@1 of 0.929. This shows that an LLM-based approach can achieve results comparable to those of deep learning models trained specifically for duplicate PR detection, and it provides a new direction for applying LLMs in software engineering.
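The pipeline described in the abstract (summarize, vectorize, score similarity, rank) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption for illustration and not the authors' implementation: `summarize` is a hypothetical placeholder standing in for the LLM call, the bag-of-words vectorizer and cosine similarity are stand-ins for whatever embedding and similarity measure DupLLM actually uses, and the PR texts are made up. `precision_at_1` follows the usual reading of P@1: the fraction of queries whose top-ranked candidate is the true duplicate.

```python
import math
import re
from collections import Counter


def summarize(pr_text: str) -> str:
    """Hypothetical placeholder for the LLM summarization step.

    A real pipeline would send pr_text to an LLM and return its summary;
    here the text is only whitespace-normalized so the sketch runs offline.
    """
    return " ".join(pr_text.split())


def vectorize(summary: str) -> Counter:
    """Assumed vectorizer: lowercase bag-of-words term counts."""
    return Counter(re.findall(r"[a-z0-9]+", summary.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(query: str, candidates: dict[str, str]) -> list[tuple[str, float]]:
    """Return (pr_id, score) pairs sorted from most to least similar."""
    q_vec = vectorize(summarize(query))
    scored = [(pr_id, cosine(q_vec, vectorize(summarize(text))))
              for pr_id, text in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def precision_at_1(rankings: list[list[tuple[str, float]]], truths: list[str]) -> float:
    """Fraction of queries whose top-ranked candidate is the true duplicate."""
    hits = sum(1 for ranking, truth in zip(rankings, truths)
               if ranking and ranking[0][0] == truth)
    return hits / len(truths) if truths else 0.0


# Made-up example PRs, for illustration only.
candidates = {
    "pr-101": "Fix null pointer crash in config parser when file is empty",
    "pr-102": "Add dark mode toggle to settings page",
    "pr-103": "Update README installation instructions",
}
query = "Config parser crash on empty file: fix null pointer"
ranking = rank_candidates(query, candidates)
print(ranking[0][0])
```

In this toy setup the query shares most of its terms with `pr-101` and none with the other two, so `pr-101` ranks first. Swapping in a real LLM summarizer and a dense embedding model would change only `summarize` and `vectorize`; the ranking and P@1 logic would stay the same.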
Thu 5 Dec (displayed time zone: Beijing, Chongqing, Hong Kong, Urumqi)
14:00 - 15:30 | Session (8), Technical Track, at Room 1 (Zunhui Room) | Chair(s): Zhou Yang (Singapore Management University)
14:00, 30m, Talk | DupLLM: Duplicate Pull Requests Detection Based on Large Language Model | Technical Track | Zhifang Liao (Central South University), Pei Liu (Monash University), Peng Lan (School of Computer Science and Engineering, Central South University, Changsha, China), Ke Sun (Central South University)
14:30, 30m, Talk | Exploring the Potential of Large Language Models in Automatic Pull Request Title Generation: An Empirical Study | Technical Track | YiTao Zuo (School of Computer Science and Engineering, Central South University, Changsha, China), Peng Lan (School of Computer Science and Engineering, Central South University, Changsha, China), Zhifang Liao (Central South University)
15:00, 30m, Talk | ModelCS: A Two-Stage Framework for Model Search | Technical Track | Lingjun Zhao (National University of Defense Technology), Zhouyang Jia (National University of Defense Technology), Jiaying Li (National University of Defense Technology), Haoran Liu (National University of Defense Technology), Linxiao Bai (National University of Defense Technology), Shanshan Li (National University of Defense Technology)