Mining Pull Requests to Detect Process Anomalies in Open Source Software Development
Trustworthy Open Source Software (OSS) development processes are the basis for the long-term trustworthiness of software projects and products. To investigate the trustworthiness of the Pull Request (PR) process, the common model of collaborative development in the OSS community, we use process mining to identify and analyze the normal and anomalous patterns of PR processes. We propose an approach that identifies anomalies from both control-flow and semantic aspects, and then analyzes and synthesizes the root causes of the identified anomalies. We analyze 17,531 PRs from 18 OSS projects on GitHub, extracting 26 root causes of control-flow anomalies and 19 root causes of semantic anomalies. We find that PRs rarely contain both semantic and control-flow anomalies, and that project-internal custom rules may be the key cause of the identified anomalous PRs. We further discover and analyze the patterns of normal PR processes. We find that PRs in the non-fork model (42%) are far more likely than those in the fork model (5%) to bypass the review process, indicating a higher potential risk. In addition, we analyzed nine poisoned projects, whose PR practices were indeed worse. Given the complex and diverse PR processes in the OSS community, the proposed approach can help identify and understand not only anomalous PRs but also normal PRs, offering early risk indications of suspicious incidents (such as poisoning) to the OSS supply chain.
Thu 18 Apr (displayed time zone: Lisbon)
14:00 - 15:30 | Human and Social 5 | Software Engineering in Society / Journal-first Papers / New Ideas and Emerging Results / Software Engineering Education and Training / Research Track at Almada Negreiros | Chair(s): Alexander Serebrenik (Eindhoven University of Technology)
14:00 15m Talk | High Expectations: An Observational Study of Programming and Cannabis Intoxication | Research Track | Wenxin He (University of Michigan), Manasvi Parikh (University of Michigan), Westley Weimer (University of Michigan), Madeline Endres (University of Michigan)
14:15 15m Talk | Mining Pull Requests to Detect Process Anomalies in Open Source Software Development | Research Track | Bohan Liu (Nanjing University), He Zhang (Nanjing University), Weigang Ma (Nanjing University), Hongyu Kuang (Nanjing University), Yi Yang (National University of Defense Technology), Jinwei Xu (Nanjing University), Shan Gao (Huawei), Jian Gao (Huawei)
14:30 15m Talk | Video-based Training for Meeting Communication Skills | Software Engineering Education and Training | Matthias Galster (University of Canterbury), Antonija Mitrovic (University of Canterbury), Sanna Malinen (University of Canterbury), Sreedevi Sankara Iyer (University of Canterbury), Ja'afaru Musa (University of Canterbury), Jay Holland (University of Canterbury)
14:45 15m Talk | Impostor Phenomenon in Software Engineers | Software Engineering in Society | Paloma Guenes (Pontifical Catholic University of Rio de Janeiro (PUC-Rio)), Rafael Tomaz (Pontifical Catholic University of Rio de Janeiro (PUC-Rio)), Marcos Kalinowski (Pontifical Catholic University of Rio de Janeiro (PUC-Rio)), Maria Teresa Baldassarre (Department of Computer Science, University of Bari), Margaret-Anne Storey (University of Victoria)
15:00 7m Talk | An Empirical Comparison of Ethnic and Gender Diversity of DevOps and non-DevOps Contributions to Open-Source Projects | Journal-first Papers | Nimmi Rashinika Weeraddana (University of Waterloo), Xiaoyan Xu (University of Waterloo), Mahmoud Alfadel (University of Waterloo), Shane McIntosh (University of Waterloo), Mei Nagappan (University of Waterloo)
15:07 7m Talk | Understanding Developers Well-Being and Productivity: a 2-year Longitudinal Analysis during the COVID-19 Pandemic | Journal-first Papers | Daniel Russo (Department of Computer Science, Aalborg University), Paul Hanel (University of Essex), Niels van Berkel (Aalborg University)
15:14 7m Talk | Decomposing and Measuring Trust in Open-Source Software Supply Chains | New Ideas and Emerging Results | Lina Boughton (The College of Wooster), Courtney Miller (Carnegie Mellon University), Yasemin Acar (Max Planck Institute for Security and Privacy), Dominik Wermke (North Carolina State University), Christian Kästner (Carnegie Mellon University)