Comparative Study of Reinforcement Learning in GitHub Pull Request Outcome Predictions (SANER 2024 - Research Papers)

Tue 12 - Fri 15 March 2024, Rovaniemi, Finland

Who

Rinkesh Joshi, Nafiseh Kahani

Track

SANER 2024 Research Papers

Time Zone

The program is currently displayed in (GMT+02:00) Athens.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Thu 14 Mar 2024 14:45 - 15:00 at LAPPI - Empirical Studies Chair(s): Valentina Lenarduzzi

Abstract

In the rapidly evolving field of software development, pull-based development models, facilitated by tools such as GitHub, are essential for collaboration. This study explores factors that influence pull request (PR) outcomes and employs two Reinforcement Learning (RL) formalizations, modeled as Markov Decision Processes, for PR outcome prediction. The first model leverages 72 PR features and achieves a G-mean score of 0.82664, while the second focuses solely on PR discussions, resulting in a G-mean of 0.88372. Using a specially designed reward function, these RL formalizations strategically address data imbalance and excel in mimicking both single-stage and multi-stage PR review processes. They outperform baseline models (Random Forest, XGBoost, and Naive Bayes) across various data splits (80/20, 50/50, and 20/80) and are particularly effective at predicting PR rejections. The study also makes its datasets publicly available for future research.
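
The abstract reports results as G-mean scores but this listing does not define the metric. A minimal sketch follows, assuming the common binary definition (the geometric mean of sensitivity and specificity); the paper's exact formulation may differ. The function name `g_mean` and the label encoding (1 = merged, 0 = rejected) are hypothetical and chosen only for illustration.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for binary labels.

    Assumption: this is the usual binary G-mean, not necessarily the
    exact definition used in the paper.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)

    # Recall on each class; defined as 0.0 when a class is absent.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Hypothetical imbalanced sample: 8 merged PRs (1), 2 rejected PRs (0).
y_true = [1] * 8 + [0] * 2

# Always predicting "merged" is 80% accurate but never finds a rejection.
always_merged = [1] * 10

# A predictor that recovers 6 of 8 merges and 1 of 2 rejections.
mixed = [1] * 6 + [0] * 2 + [0, 1]

score_always = g_mean(y_true, always_merged)
score_mixed = g_mean(y_true, mixed)
```

Because the score is a product of per-class recalls, a model that ignores the minority class scores zero regardless of its accuracy, which is why the metric suits imbalanced outcomes such as PR rejections.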

Rinkesh Joshi

Carleton University

Canada

Nafiseh Kahani

Carleton University

Canada

Session Program

Thu 14 Mar
Displayed time zone: Athens

	14:00 - 15:30	Empirical Studies (Research Papers) at LAPPI. Chair(s): Valentina Lenarduzzi (University of Oulu)

	14:00	15m	Talk	Exploring Markers and Drivers of Gender Bias in Machine Translations	Research Papers	Peter Barclay (Edinburgh Napier University), Ashkan Sami (Edinburgh Napier University)
	14:15	15m	Talk	Delving into Parameter-Efficient Fine-Tuning in Code Change Learning: An Empirical Study	Research Papers	Shuo Liu (City University of Hong Kong), Jacky Keung (City University of Hong Kong), Zhen Yang (Shandong University), Fang Liu (Beihang University), Qilin Zhou (City University of Hong Kong), Yihan Liao (City University of Hong Kong)
	14:30	15m	Talk	Catch the Butterfly: Peeking into the Terms and Conflicts among SPDX Licenses	Research Papers	Liu Tao, Chengwei Liu (Nanyang Technological University), Tianwei Liu (School of Cyber Engineering, Xidian University), He Wang (School of Cyber Engineering, Xidian University), Gaofei Wu (School of Cyber Engineering, Xidian University), Yang Liu (Nanyang Technological University), Yuqing Zhang (University of Chinese Academy of Sciences; Zhongguancun Laboratory)
	14:45	15m	Talk	Comparative Study of Reinforcement Learning in GitHub Pull Request Outcome Predictions	Research Papers	Rinkesh Joshi (Carleton University), Nafiseh Kahani (Carleton University)
	15:00	15m	Talk	On the Usefulness of Python Structural Pattern Matching: An Empirical Study	Research Papers	Norbert Vándor (University of Szeged), Gabor Antal (University of Szeged), Peter Hegedus (University of Szeged), Rudolf Ferenc (University of Szeged)
	15:15	15m	Talk	Deep Learning Model Reuse in the HuggingFace Community: Challenges, Benefit and Trends	Research Papers	Mina Taraghi (Polytechnique Montréal), Gianolli Dorcelus (Polytechnique Montréal), Armstrong Tita Foundjem (Ecole Polytechnique de Montreal), Florian Tambon (Polytechnique Montréal), Foutse Khomh (Polytechnique Montréal)