APSEC 2024
Tue 3 - Fri 6 December 2024 China

This program is tentative and subject to change.

Wed 4 Dec 2024 15:00 - 15:20 at Room 4 (Xiangquan Ballroom) - Session (4)

As pull-based software development has become popular, many empirical studies collect pull requests (PRs). Although researchers can use publicly available datasets, on-demand collection of PR data remains indispensable for filling in missing information or obtaining the latest information. Unfortunately, PR data collected through the GitHub API is sometimes deficient: parts of the data are lost. Such data loss hampers researchers' data analysis. To reveal which PR-related data tends to be lost during collection through the GitHub API, we conducted a study of 12,118 pull requests in six repositories of OSS projects on GitHub. In the study, we clarified the PR data that needs to be obtained through the GitHub API by defining its entities as features and attributes. We also collected data losses and classified them by checking the lost attributes based on exception reports triggered during PR collection. The collected data losses fell into seven categories. Our results show that more than half of the PRs (about 53%) involve some data loss, which may surprise many researchers. The paper also discusses possible causes of the data losses, which helps researchers filter out deficient PRs during collection.
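
The idea of filtering out deficient PRs by checking for lost attributes can be illustrated with a minimal sketch. It is not the authors' method: the attribute list `REQUIRED_PATHS` and the helper names `lost_attributes` and `split_deficient` are hypothetical, and the seven loss categories from the paper are not reproduced here. The sketch only assumes that each PR has already been fetched as a JSON-like dictionary and that a missing key or a null value counts as a lost attribute.

```python
from typing import Any

# Hypothetical attributes a study might require for each PR.
# This list is an assumption for illustration, not the paper's definition.
REQUIRED_PATHS = [
    ("number",),
    ("title",),
    ("user", "login"),
    ("created_at",),
    ("head", "repo", "full_name"),
    ("base", "repo", "full_name"),
]


def lost_attributes(pr: dict[str, Any]) -> list[str]:
    """Return the dotted paths of required attributes that are missing or null."""
    lost = []
    for path in REQUIRED_PATHS:
        node: Any = pr
        for key in path:
            # A non-dict parent, a missing key, or a null value all count as lost.
            if not isinstance(node, dict) or node.get(key) is None:
                lost.append(".".join(path))
                break
            node = node[key]
    return lost


def split_deficient(prs: list[dict[str, Any]]) -> tuple[list, list]:
    """Partition PR records into (complete, deficient) lists."""
    complete, deficient = [], []
    for pr in prs:
        (deficient if lost_attributes(pr) else complete).append(pr)
    return complete, deficient
```

Recording which attributes were lost, rather than silently dropping the PR, keeps the filtering step auditable, so the share of deficient PRs can be reported alongside the analysis.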

Wed 4 Dec

Displayed time zone: Beijing, Chongqing, Hong Kong, Urumqi

14:00 - 15:30
14:00
30m
Talk
An Empirical Study of Cross-Project Pull Request Recommendation in GitHub
Technical Track
Wenyu Xu National University of Defense Technology, Yao Lu National University of Defense Technology, Xunhui Zhang National University of Defense Technology, China, Tanghaoran Zhang National University of Defense Technology, Xinjun Mao National University of Defense Technology, Bo Lin National University of Defense Technology
14:30
30m
Talk
FRELinker: A Novel Issue-Commit Link Recovery Model Based on Feature Refinement and Expansion with Multi-Classifier Fusion
Technical Track
Bangchao Wang Wuhan Textile University, Xinyu He School of Computer Science and Artificial Intelligence, Wuhan Textile University, Hongyan Wan Wuhan Textile University, Xiaoxiao Li School of Computer Science and Artificial Intelligence, Wuhan Textile University, Jiaxu Zhu School of Computer Science and Artificial Intelligence, Wuhan Textile University, Yukun Cao School of Computer Science and Artificial Intelligence, Wuhan Textile University
15:00
20m
Talk
Towards Filtering Out Deficient Pull Requests Collected through the GitHub API
ERA - Early Research Achievements
Bowen Tang Ritsumeikan University, Xiqin Lu Ritsumeikan University, Katsuhisa Maruyama Ritsumeikan University