MSR 2022
Mon 23 - Tue 24 May 2022
co-located with ICSE 2022
Tue 17 May 2022 22:22 - 22:29 at MSR Main room - even hours - Session 1 Chair(s): Hongyu Zhang, Masud Rahman

Bots have become popular in software projects, where they play critical roles ranging from running tests to fixing bugs and vulnerabilities. However, the large number of software bots imposes extra effort on practitioners and researchers, who must distinguish human accounts from bot accounts to avoid bias in data-driven studies. Researchers have developed several approaches to identify bots, but these work at a specific activity level (issue/pull request or commit), consider a single repository, and disregard features that were shown to be effective in other domains. To address this gap, we propose a machine-learning-based approach that identifies bot accounts regardless of their activity level. We extracted 19 features related to the account’s profile information, activities, and comment similarity. We then evaluated the performance of five machine learning classifiers on a dataset of more than 5,000 GitHub accounts. Our results show that the Random Forest classifier performs best, with an F1-score of 92.4% and an AUC of 98.7%. Furthermore, features derived from account profile information (e.g., the account login) are the most important for identifying the account type. Finally, we compare the Random Forest classifier with state-of-the-art approaches; it outperforms them in identifying account types regardless of activity level.
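To make the feature categories in the abstract more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not list the 19 features, so the function names, the "login contains 'bot'" check, and the use of word-level Jaccard similarity are all illustrative assumptions. The F1 helper only shows how a score such as the one reported is computed from confusion counts.

```python
from itertools import combinations


def login_mentions_bot(login: str) -> bool:
    """Hypothetical profile feature: does the login contain 'bot'?"""
    return "bot" in login.lower()


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the word sets of two comments."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0  # two empty comments are treated as identical
    return len(words_a & words_b) / len(words_a | words_b)


def mean_comment_similarity(comments: list[str]) -> float:
    """Hypothetical comment-similarity feature: average pairwise Jaccard.

    Returns 0.0 when an account has fewer than two comments.
    """
    pairs = list(combinations(comments, 2))
    if not pairs:
        return 0.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 as the harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

The intuition behind the similarity feature, under these assumptions, is that automated accounts tend to post near-identical templated comments, so their average pairwise similarity would be high compared with human accounts.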

Tue 17 May

Displayed time zone: Eastern Time (US & Canada)

22:00 - 22:50
Session 1: Technical Papers / Registered Reports at MSR Main room - even hours
Chair(s): Hongyu Zhang University of Newcastle, Masud Rahman Dalhousie University
22:00
4m
Short-paper
An Empirical Evaluation of GitHub Copilot’s Code Suggestions
Technical Papers
Nhan Nguyen University of Alberta, Sarah Nadi University of Alberta
22:04
4m
Short-paper
Comments on Comments: Where Code Review and Documentation Meet
Technical Papers
Nikitha Rao Carnegie Mellon University, Jason Tsay IBM Research, Martin Hirzel IBM Research, Vincent J. Hellendoorn Carnegie Mellon University
22:08
7m
Talk
Does This Apply to Me? An Empirical Study of Technical Context in Stack Overflow
Technical Papers
Akalanka Galappaththi University of Alberta, Sarah Nadi University of Alberta, Christoph Treude University of Melbourne
22:15
7m
Talk
Towards Reliable Agile Iterative Planning via Predicting Documentation Changes of Work Items
Technical Papers
Jirat Pasuksmit University of Melbourne, Patanamon Thongtanunam University of Melbourne, Shanika Karunasekera The University of Melbourne
22:22
7m
Talk
BotHunter: An Approach to Detect Software Bots in GitHub
Technical Papers
Ahmad Abdellatif Concordia University, Mairieli Wessel Delft University of Technology, Igor Steinmacher Northern Arizona University, Marco Gerosa Northern Arizona University, USA, Emad Shihab Concordia University
22:29
7m
Talk
Recommending Code Improvements Based on Stack Overflow Answer Edits
Registered Reports
Chaiyong Ragkhitwetsagul Mahidol University, Thailand, Matheus Paixao University of Fortaleza
22:36
14m
Live Q&A
Discussions and Q&A
Technical Papers

