An Empirical Study of Cross-Project Pull Request Recommendation in GitHub
As a core mechanism for merging contributions in distributed collaborative development, pull requests (PRs) contain valuable knowledge about code evolution and issue resolution. As multiple projects in a software ecosystem co-evolve, relevant and similar issues can arise across different projects. Leveraging existing solutions in PRs through cross-project pull request recommendation (CPR) can enrich contextual knowledge and improve the efficiency of issue resolution. However, the characteristics of CPR and its effectiveness in the process of issue resolution remain unclear. To bridge this gap, we conduct an empirical study of CPR on GitHub. We first extract 4,445 CPRs from 2,500 open source projects and quantitatively analyze their characteristics. We then conduct a qualitative analysis of developers’ comments on CPRs to understand their influence. We also use a regression model to explore the impact of CPRs on issue resolution. Our main findings are as follows: (1) Experienced contributors in target projects make most of the CPRs, and their CPRs are more timely than those of inexperienced contributors; (2) In the CPR dataset, bugs constitute the largest proportion of target issue types, followed by enhancements, features, and questions; (3) Nearly half of the CPRs are accepted by issue participants; (4) More CPRs contribute indirectly to solving the target issue, by offering solutions and contextual information, than provide code that can be applied directly to the issue; (5) Most CPR-related factors have a significant impact on issue resolution delay. Among these, recommendation latency has the most significant impact, followed by the type of recommender. Our work offers insights into PR recommendation and practical guidance for developers on recommending cross-project PRs to resolve the rapidly growing number of issues.
Wed 4 Dec | Displayed time zone: Beijing, Chongqing, Hong Kong, Urumqi
14:00 - 15:30 | Session (4) Technical Track / ERA - Early Research Achievements at Room 4 (Xianglin Ballroom) | Chair(s): Lina Gong (Nanjing University of Aeronautics and Astronautics)
14:00 | 30m Talk | An Empirical Study of Cross-Project Pull Request Recommendation in GitHub | Technical Track | Wenyu Xu (National University of Defense Technology), Yao Lu (National University of Defense Technology), Xunhui Zhang (National University of Defense Technology, China), Tanghaoran Zhang (National University of Defense Technology), Xinjun Mao (National University of Defense Technology), Bo Lin (National University of Defense Technology)
14:30 | 30m Talk | FRELinker: A Novel Issue-Commit Link Recovery Model Based on Feature Refinement and Expansion with Multi-Classifier Fusion | Technical Track | Bangchao Wang (Wuhan Textile University), Xinyu He (School of Computer Science and Artificial Intelligence, Wuhan Textile University), Hongyan Wan (Wuhan Textile University), Xiaoxiao Li (School of Computer Science and Artificial Intelligence, Wuhan Textile University), Jiaxu Zhu (School of Computer Science and Artificial Intelligence, Wuhan Textile University), Yukun Cao (School of Computer Science and Artificial Intelligence, Wuhan Textile University)
15:00 | 20m Talk | Towards Filtering Out Deficient Pull Requests Collected through the GitHub API | ERA - Early Research Achievements | Bowen Tang (Ritsumeikan University), Xiqin Lu (Ritsumeikan University), Katsuhisa Maruyama (Ritsumeikan University)