SANER 2025
Tue 4 - Fri 7 March 2025 Montréal, Québec, Canada
Fri 7 Mar 2025 11:15 - 11:30 at L-1710 - Change Management & Program Comprehension Chair(s): Masud Rahman

Abstract: In this work, we propose the EarlyPR framework, which identifies and predicts potential pull-request (PR) contributions from the forks of an open source software (OSS) project. EarlyPR can potentially improve the efficiency of fork-and-pull based development in OSS projects by supporting early warning of duplicated and rejected contributions, and detection of lost contributions. Unlike traditional PR-based studies, which rely on the descriptions and contents of PRs provided by their creators and are therefore only applicable after the PRs are created, EarlyPR makes predictions before the creation of PRs by mining the forks’ commit history. EarlyPR’s task is challenging because of the explosive number of commit subsets in a fork’s commit history that may form PRs, and the absence of the resulting, real PR-related information. To tackle these challenges, we adopt a state-of-the-art Transformer-based architecture to extract rich statistical and content information from the forks and their commits to support the prediction of potential PR contributions. To make the approach scalable, we devise a TemporalFilter that finds candidate PRs by mimicking the real-world process of picking subsets of commits from a fork’s commit history when creating PRs. Experimental results on data from real-world OSS projects and their forks suggest that EarlyPR is effective in predicting PRs, which are essentially sets of commits selected from forks: it achieves a hitting rate of 0.790 and a missing rate of 0.367 when predicted and real PRs are matched under a stringent criterion of IoU > 0.5. We further demonstrate that we can forecast the merging of PRs based on EarlyPR’s predictions with an accuracy of 70.8%.
In summary, the proposed approach can potentially improve the efficiency of fork-and-pull based OSS development by making accurate and early predictions of PR contributions from distributed, and often independently developed, forks.
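
The abstract treats a PR as a set of commits and matches predicted PRs to real ones when IoU > 0.5. The following is a minimal sketch of that matching criterion only, assuming IoU means intersection over union of the two commit sets. The function names and commit IDs are hypothetical and do not come from the paper; this is not the authors' implementation.

```python
def commit_set_iou(predicted, actual):
    """Intersection over union of two collections of commit IDs (assumed definition)."""
    p, a = set(predicted), set(actual)
    union = p | a
    if not union:
        return 0.0  # two empty sets: treat as no overlap
    return len(p & a) / len(union)


def is_match(predicted, actual, threshold=0.5):
    """A predicted PR matches a real PR when IoU is strictly above the threshold."""
    return commit_set_iou(predicted, actual) > threshold


# Hypothetical commit IDs, for illustration only.
real_pr = ["c1", "c2", "c3", "c4"]
close_prediction = ["c1", "c2", "c3"]   # shares 3 of 4 distinct commits
loose_prediction = ["c3", "c4", "c5"]   # shares 2 of 5 distinct commits

print(commit_set_iou(close_prediction, real_pr), is_match(close_prediction, real_pr))
print(commit_set_iou(loose_prediction, real_pr), is_match(loose_prediction, real_pr))
```

Under this assumed definition, the first prediction has IoU 0.75 and counts as a match, while the second has IoU 0.4 and does not.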

Fri 7 Mar

Displayed time zone: Eastern Time (US & Canada)

11:00 - 12:30
11:00
15m
Talk
AdvFusion: Adapter-based Knowledge Transfer for Code Summarization on Code Language Models (Best Paper Award)
Research Papers
Iman Saberi University of British Columbia Okanagan, Amirreza Esmaeili University of British Columbia, Fatemeh Hendijani Fard University of British Columbia, Chen Fuxiang University of Leicester
11:15
15m
Talk
EarlyPR: Early Prediction of Potential Pull-Requests from Forks
Research Papers
XiangChen Wu, Liang Wang Nanjing University, Xianping Tao Nanjing University
11:30
15m
Talk
The Hidden Challenges of Merging: A Tool-Based Exploration
Research Papers
Luciana Gomes UFCG, Melina Mongiovi Federal University of Campina Grande, Brazil, Sabrina Souto UEPB, Everton L. G. Alves Federal University of Campina Grande
11:45
7m
Talk
On the Performance of Large Language Models for Code Change Intent Classification
Early Research Achievement (ERA) Track
Issam Oukay Department of Software and IT Engineering, ETS Montreal, University of Quebec, Montreal, Canada, Moataz Chouchen Department of Electrical and Computer Engineering, Concordia University, Montreal, Canada, Ali Ouni ETS Montreal, University of Quebec, Fatemeh Hendijani Fard University of British Columbia
11:52
15m
Talk
Revisiting Method-Level Change Prediction: Comparative Evaluation at Different Granularities
Reproducibility Studies and Negative Results (RENE) Track
Hiroto Sugimori School of Computing, Institute of Science Tokyo, Shinpei Hayashi Institute of Science Tokyo