MSR 2022
Mon 23 - Tue 24 May 2022
co-located with ICSE 2022

Tue 17 May

Displayed time zone: Eastern Time (US & Canada)

21:00 - 21:50
Newcomer Orientation I
Technical Papers / Shadow PC / Hackathon / FOSS Award / MSR Awards / Mining Challenge / Registered Reports / Keynotes / Industry Track / MIP Award / Tutorials / Vision and Reflection / Data and Tool Showcase Track at MSR Newcomer Orientation room
Chair(s): Yuan Tian Queen's University, Kingston, Canada, Gias Uddin University of Calgary, Canada

Mentors: Bram Adams, Fatemeh Fard, Li Li, Ali Ouni, Tianyi Zhang

22:00 - 22:50
Session 1
Technical Papers / Registered Reports at MSR Main room - even hours
Chair(s): Hongyu Zhang University of Newcastle, Masud Rahman Dalhousie University
22:00
4m
Short-paper
An Empirical Evaluation of GitHub Copilot’s Code Suggestions
Technical Papers
Nhan Nguyen University of Alberta, Sarah Nadi University of Alberta
DOI Pre-print
22:04
4m
Short-paper
Comments on Comments: Where Code Review and Documentation Meet
Technical Papers
Nikitha Rao Carnegie Mellon University, Jason Tsay IBM Research, Martin Hirzel IBM Research, Vincent J. Hellendoorn Carnegie Mellon University
DOI Pre-print File Attached
22:08
7m
Talk
Does This Apply to Me? An Empirical Study of Technical Context in Stack Overflow
Technical Papers
Akalanka Galappaththi University of Alberta, Sarah Nadi University of Alberta, Christoph Treude University of Melbourne
DOI Pre-print Media Attached
22:15
7m
Talk
Towards Reliable Agile Iterative Planning via Predicting Documentation Changes of Work Items
Technical Papers
Jirat Pasuksmit University of Melbourne, Patanamon Thongtanunam University of Melbourne, Shanika Karunasekera The University of Melbourne
22:22
7m
Talk
BotHunter: An Approach to Detect Software Bots in GitHub
Technical Papers
Ahmad Abdellatif Concordia University, Mairieli Wessel Delft University of Technology, Igor Steinmacher Northern Arizona University, Marco Gerosa Northern Arizona University, USA, Emad Shihab Concordia University
Pre-print
22:29
7m
Talk
Recommending Code Improvements Based on Stack Overflow Answer Edits
Registered Reports
Chaiyong Ragkhitwetsagul Mahidol University, Thailand, Matheus Paixao University of Fortaleza
Pre-print
22:36
14m
Live Q&A
Discussions and Q&A
Technical Papers

Wed 18 May


03:00 - 03:50
03:00
4m
Talk
An Alternative Issue Tracking Dataset of Public Jira Repositories
Data and Tool Showcase Track
Lloyd Montgomery Universität Hamburg, Clara Marie Lüders University of Hamburg, Walid Maalej University of Hamburg
Pre-print Media Attached
03:04
7m
Talk
Smelly Variables in Ansible Infrastructure Code: Detection, Prevalence, and Lifetime
Technical Papers
Ruben Opdebeeck Vrije Universiteit Brussel, Ahmed Zerouali Vrije Universiteit Brussel, Coen De Roover Vrije Universiteit Brussel
Pre-print
03:11
7m
Talk
Beyond Duplicates: Towards Understanding and Predicting Link Types in Issue Tracking Systems
Technical Papers
Clara Marie Lüders University of Hamburg, Abir Bouraffa University of Hamburg, Walid Maalej University of Hamburg
DOI Pre-print
03:18
7m
Talk
Real-World Clone-Detection in Go
Industry Track
Qinyun Wu Bytedance Ltd., Huan Song Bytedance Ltd., Ping Yang Bytedance Network Technology
03:25
4m
Talk
Towards Using Gameplay Videos for Detecting Issues in Video Games
Registered Reports
Emanuela Guglielmi University of Molise, Simone Scalabrino University of Molise, Gabriele Bavota Software Institute, USI Università della Svizzera italiana, Rocco Oliveto University of Molise
Pre-print
03:29
4m
Talk
Is Surprisal in Issue Trackers Actionable?
Registered Reports
James Caddy University of Adelaide, Markus Wagner University of Adelaide, Australia, Christoph Treude University of Melbourne, Earl T. Barr University College London, UK, Miltiadis Allamanis Microsoft Research
DOI Pre-print Media Attached
03:33
17m
Live Q&A
Discussions and Q&A
Technical Papers

04:00 - 04:50
Newcomer Orientation II
Technical Papers at MSR Newcomer Orientation room
Chair(s): Tegawendé F. Bissyandé SnT, University of Luxembourg, Chaiyong Ragkhitwetsagul Mahidol University, Thailand

Mentors: Bodin Chinthanet, Raula Gaikovina Kula, Christoph Treude, Xin Xia

05:00 - 05:50
Session 3: Introspection, Vision, and Human Aspects
Technical Papers / Data and Tool Showcase Track / Industry Track / Registered Reports at MSR Main room - odd hours
Chair(s): Alexander Serebrenik Eindhoven University of Technology, Sebastian Baltes SAP SE & University of Adelaide
05:00
4m
Short-paper
Geographic Diversity in Public Code Contributions
Technical Papers
Davide Rossi University of Bologna, Stefano Zacchiroli Télécom Paris, Polytechnic Institute of Paris
Pre-print Media Attached
05:04
7m
Talk
Operationalizing Threats to MSR Studies by Simulation-Based Testing
Distinguished Paper Award
Technical Papers
Johannes Härtel University of Koblenz-Landau, Germany, Ralf Laemmel Facebook London
Pre-print Media Attached
05:11
4m
Talk
The General Index of Software Engineering Papers
Data and Tool Showcase Track
Zeinab Abou Khalil Inria, Stefano Zacchiroli Télécom Paris, Polytechnic Institute of Paris
DOI Pre-print
05:15
7m
Talk
Challenges and Future Research Direction for Microtask Programming in Industry
Industry Track
Masanari Kondo Kyushu University, Shinobu Saito NTT, IIMURA Yukako NTT, Eunjong Choi Kyoto Institute of Technology, Osamu Mizuno Kyoto Institute of Technology, Yasutaka Kamei Kyushu University, Naoyasu Ubayashi Kyushu University
DOI Pre-print Media Attached
05:22
7m
Talk
Starting the InnerSource Journey: Key Goals and Metrics to Measure Collaboration
Industry Track
Daniel Izquierdo-Cortazar Bitergia, Jesús Alonso-Gutiérrez Santander Bank, Alberto Pérez García-Plaza Bitergia, Gregorio Robles Universidad Rey Juan Carlos, Jesus M. Gonzalez-Barahona Universidad Rey Juan Carlos
Pre-print Media Attached
05:29
4m
Talk
Investigating the Impact of Forgetting in Software Development
Registered Reports
Utku Unal METU, Eray Tüzün Bilkent University, Tamer Gezici Bilkent University, Ausaf Ahmed Farooqui Bilkent University
Pre-print
05:33
17m
Live Q&A
Discussions and Q&A
Technical Papers

11:00 - 11:50
Keynote: Christian Kästner – From Models to Systems: Rethinking the Role of Software Engineering for Machine Learning
Technical Papers at MSR Plenary room
Chair(s): Nicole Novielli University of Bari
12:00 - 12:50
12:00
4m
Talk
An Exploratory Study on Refactoring Documentation in Issues Handling
Mining Challenge
Eman Abdullah AlOmar Stevens Institute of Technology, Anthony Peruma Rochester Institute of Technology, Mohamed Wiem Mkaouer Rochester Institute of Technology, Christian D. Newman Rochester Institute of Technology, Ali Ouni ETS Montreal, University of Quebec
Pre-print
12:04
4m
Talk
Between JIRA and GitHub: ASFBot and its Influence on Human Comments in Issue Trackers
Mining Challenge
Ambarish Moharil Eindhoven University of Technology, Dmitrii Orlov Eindhoven University of Technology, Samar Jameel Eindhoven University of Technology, Tristan Trouwen Eindhoven University of Technology, Nathan Cassee Eindhoven University of Technology, Alexander Serebrenik Eindhoven University of Technology
Pre-print
12:08
4m
Talk
Is Refactoring Always a Good Egg? Exploring the Interconnection Between Bugs and Refactorings
Mining Challenge
Amirreza Bagheri University of Szeged, Peter Hegedus University of Szeged
File Attached
12:12
4m
Talk
On the Co-Occurrence of Refactoring of Test and Source Code
Mining Challenge
Nicholas Nagy Concordia University, Rabe Abdalkareem Carleton University
Pre-print Media Attached
12:16
4m
Talk
Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship Between Technical Debt and Refactoring
Best Mining Challenge Paper Award
Mining Challenge
Anthony Peruma Rochester Institute of Technology, Eman Abdullah AlOmar Stevens Institute of Technology, Christian D. Newman Rochester Institute of Technology, Mohamed Wiem Mkaouer Rochester Institute of Technology, Ali Ouni ETS Montreal, University of Quebec
Pre-print Media Attached
12:20
4m
Talk
Studying the Impact of Continuous Delivery Adoption on Bug-Fixing Time in Apache’s Open-Source Projects
Mining Challenge
Carlos Diego Andrade de Almeida Federal University of Ceará, Diego N. Feijó Federal University of Ceará, Lincoln Souza Rocha Federal University of Ceará
Media Attached
12:24
4m
Talk
Which bugs are missed in code reviews: An empirical study on SmartSHARK dataset
Mining Challenge
Fatemeh Khoshnoud Department of Computer Science and Engineering and IT; School of Electrical and Computer Engineering, Shiraz University, Ali Rezaei Nasab Department of Computer Science and Engineering and IT; School of Electrical and Computer Engineering, Shiraz University, Zahra Toudeji Department of Computer Science and Engineering and IT; School of Electrical and Computer Engineering, Shiraz University, Ashkan Sami Shiraz University
12:28
22m
Live Q&A
Discussions and Q&A
Technical Papers

13:00 - 13:50
Session 4: Software Quality (Bugs & Smells)
Data and Tool Showcase Track / Technical Papers at MSR Main room - odd hours
Chair(s): Maxime Lamothe Polytechnique Montreal, Montreal, Canada, Mahmoud Alfadel University of Waterloo
13:00
7m
Talk
Dazzle: Using Optimized Generative Adversarial Networks to Address Security Data Class Imbalance Issue
Technical Papers
Rui Shu North Carolina State University, Tianpei Xia North Carolina State University, Laurie Williams North Carolina State University, Tim Menzies North Carolina State University
13:07
7m
Talk
To What Extent do Deep Learning-based Code Recommenders Generate Predictions by Cloning Code from the Training Set?
Technical Papers
Matteo Ciniselli Università della Svizzera Italiana, Luca Pascarella Università della Svizzera italiana (USI), Gabriele Bavota Software Institute, USI Università della Svizzera italiana
Pre-print
13:14
7m
Talk
How to Improve Deep Learning for Software Analytics (a case study with code smell detection)
Technical Papers
Rahul Yedida, Tim Menzies North Carolina State University
Pre-print
13:21
7m
Talk
Using Active Learning to Find High-Fidelity Builds
Technical Papers
Harshitha Menon Lawrence Livermore National Lab, Konstantinos Parasyris Lawrence Livermore National Laboratory, Todd Gamblin Lawrence Livermore National Laboratory, Tom Scogland Lawrence Livermore National Laboratory
Pre-print
13:28
4m
Talk
ApacheJIT: A Large Dataset for Just-In-Time Defect Prediction
Data and Tool Showcase Track
Hossein Keshavarz David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, Canada, Mei Nagappan University of Waterloo
Pre-print
13:32
4m
Talk
ReCover: a Curated Dataset for Regression Testing Research
Data and Tool Showcase Track
Francesco Altiero Università degli Studi di Napoli Federico II, Anna Corazza Università degli Studi di Napoli Federico II, Sergio Di Martino Università degli Studi di Napoli Federico II, Adriano Peron Università degli Studi di Napoli Federico II, Luigi Libero Lucio Starace Università degli Studi di Napoli Federico II
13:36
14m
Live Q&A
Discussions and Q&A
Technical Papers

13:00 - 13:50
Tutorial: Empirical Standards for Repository Mining
Tutorials at MSR Tutorials room
13:00
50m
Tutorial
Empirical Standards for Repository Mining
Tutorials
Paul Ralph Dalhousie University, Tushar Sharma Dalhousie University, Preetha Chatterjee Drexel University, USA
Pre-print
14:00 - 14:50
Session 5: Communication & Domains
Data and Tool Showcase Track / Technical Papers at MSR Main room - even hours
Chair(s): Masud Rahman Dalhousie University, Mahmoud Alfadel University of Waterloo
14:00
7m
Talk
Painting the Landscape of Automotive Software in GitHub
Technical Papers
Sangeeth Kochanthara Eindhoven University of Technology, Yanja Dajsuren Eindhoven University of Technology, Loek Cleophas Eindhoven University of Technology (TU/e) and Stellenbosch University (SU), Mark van den Brand Eindhoven University of Technology
Pre-print Media Attached
14:07
7m
Full-paper
Mining the Usage of Reactive Programming APIs: A Study on GitHub and Stack Overflow
Technical Papers
Carlos Zimmerle Federal University of Pernambuco, Kiev Gama Federal University of Pernambuco, Fernando Castor Utrecht University & Federal University of Pernambuco, José Murilo Filho Federal University of Pernambuco
DOI Pre-print
14:14
4m
Talk
SoCCMiner: A Source Code-Comments and Comment-Context Miner
Data and Tool Showcase Track
Murali Sridharan University of Oulu, Mika Mäntylä University of Oulu, Maëlick Claes University of Oulu, Leevi Rantala University of Oulu
Pre-print
14:18
4m
Talk
SLNET: A Redistributable Corpus of 3rd-party Simulink Models
Data and Tool Showcase Track
Sohil Lal Shrestha The University of Texas at Arlington, Shafiul Azam Chowdhury University of Texas at Arlington, Christoph Csallner University of Texas at Arlington
DOI Pre-print Media Attached
14:22
4m
Talk
SOSum: A Dataset of Stack Overflow Post Summaries
Data and Tool Showcase Track
Bonan Kou Purdue University, Yifeng Di Purdue University, Muhao Chen University of Southern California, Tianyi Zhang Purdue University
14:26
4m
Talk
Inspect4py: A Knowledge Extraction Framework for Python Code Repositories
Data and Tool Showcase Track
Rosa Filgueira St. Andrews University, Daniel Garijo Universidad Politécnica de Madrid
14:30
4m
Talk
DISCO: A Dataset of Discord Chat Conversations for Software Engineering Research
Data and Tool Showcase Track
Keerthana Muthu Subash Carleton University, Canada, Lakshmi Prasanna Kumar Carleton University, Canada, Sri Lakshmi Vadlamani Carleton University, Canada, Preetha Chatterjee Drexel University, USA, Olga Baysal Carleton University
DOI Pre-print Media Attached
14:34
16m
Live Q&A
Discussions and Q&A
Technical Papers

14:00 - 14:50
Tutorial: Mining the Ethereum Blockchain Platform
Tutorials at MSR Tutorials room
14:00
50m
Tutorial
Mining the Ethereum Blockchain Platform: Best Practices and Pitfalls
Tutorials
Gustavo A. Oliva Queen's University
20:00 - 20:50
Session 6: Maintenance & Testing
Data and Tool Showcase Track / Technical Papers at MSR Main room - even hours
Chair(s): Ajay Jha University of Alberta, Amjed Tahir Massey University
20:00
4m
Short-paper
Characterizing High-Quality Test Methods: A First Empirical Study
Technical Papers
Pre-print
20:04
7m
Talk
CLIP meets GamePhysics: Towards bug identification in gameplay videos using zero-shot transfer learning
Technical Papers
Mohammad Reza Taesiri University of Alberta, Finlay Macklon University of Alberta, Cor-Paul Bezemer University of Alberta
20:11
7m
Talk
An Empirical Study on Maintainable Method Size in Java
Technical Papers
Shaiful Chowdhury University of Alberta, Gias Uddin University of Calgary, Canada, Reid Holmes University of British Columbia
20:18
7m
Talk
Complex Python Features in the Wild
Technical Papers
Yi Yang Rensselaer Polytechnic Institute, Ana Milanova Rensselaer Polytechnic Institute, Martin Hirzel IBM Research
20:25
4m
Talk
Methods2Test: A dataset of focal methods mapped to test cases
Data and Tool Showcase Track
Michele Tufano Microsoft, Shao Kun Deng Microsoft Corporation, Neel Sundaresan Microsoft Corporation, Alexey Svyatkovskiy
20:29
4m
Talk
npm-filter: Automating the mining of dynamic information from npm packages
Data and Tool Showcase Track
Ellen Arteca Northeastern University, Alexi Turcotte Northeastern University
Pre-print Media Attached
20:33
4m
Talk
ManyTypes4TypeScript: A Comprehensive TypeScript Dataset for Sequence-Based Type Inference
Data and Tool Showcase Track
Kevin Jesse University of California, Davis, Prem Devanbu Department of Computer Science, University of California, Davis
DOI Pre-print
20:37
13m
Live Q&A
Discussions and Q&A
Technical Papers

21:00 - 21:50
Session 7: Developer Wellbeing & Project Communication
Technical Papers / Data and Tool Showcase Track / Industry Track at MSR Main room - odd hours
Chair(s): Bram Adams Queen's University, Kingston, Ontario
21:00
7m
Talk
On the Violation of Honesty in Mobile Apps: Automated Detection and Categories
Distinguished Paper Award
Technical Papers
Humphrey Obie Monash University, Idowu Oselumhe Ilekura Data Science Nigeria, Hung Du Applied Artificial Intelligence Institute, Deakin University, Mojtaba Shahin RMIT University, Australia, John Grundy Monash University, Li Li Monash University, Jon Whittle CSIRO's Data61 and Monash University, Burak Turhan University of Oulu
Pre-print
21:07
7m
Talk
How heated is it? Understanding GitHub locked issues
Technical Papers
Isabella Ferreira Polytechnique Montréal, Bram Adams Queen's University, Kingston, Ontario, Jinghui Cheng Polytechnique Montreal
Pre-print Media Attached
21:14
4m
Talk
The OCEAN mailing list data set: Network analysis spanning mailing lists and code repositories
Data and Tool Showcase Track
Melanie Warrick University of Vermont, Samuel F. Rosenblatt University of Vermont, Jean-Gabriel Young University of Vermont, amanda casari Open Source Programs Office, Google, Laurent Hébert-Dufresne University of Vermont, James P. Bagrow University of Vermont
DOI Pre-print Media Attached
21:18
4m
Talk
The Unexplored Treasure Trove of Phabricator Code Reviews
Data and Tool Showcase Track
Gunnar Kudrjavets University of Groningen, Nachiappan Nagappan Microsoft Research, Ayushi Rastogi University of Groningen, The Netherlands
DOI Pre-print
21:22
4m
Talk
The Unsolvable Problem or the Unheard Answer? A Dataset of 24,669 Open-Source Software Conference Talks
Data and Tool Showcase Track
Kimberly Truong Oregon State University, Courtney Miller Carnegie Mellon University, Bogdan Vasilescu Carnegie Mellon University, USA, Christian Kästner Carnegie Mellon University
DOI Pre-print
21:26
4m
Talk
Exploring Apache Incubator Project Trajectories with APEX
Data and Tool Showcase Track
Anirudh Ramchandran University of California, Davis, Likang Yin University of California, Davis, Vladimir Filkov University of California at Davis
21:30
7m
Talk
A Culture of Productivity: Maximizing Productivity by Maximizing Wellbeing
Industry Track
Brian Houck Microsoft Research
21:37
13m
Live Q&A
Discussions and Q&A
Technical Papers

Thu 19 May


03:00 - 03:50
Session 8: Large-Scale Mining & Software Ecosystems
Technical Papers / Data and Tool Showcase Track at MSR Main room - odd hours
Chair(s): Fiorella Zampetti University of Sannio, Italy, Gregorio Robles Universidad Rey Juan Carlos
03:00
7m
Talk
An Empirical Study on the Survival Rate of GitHub Projects
Technical Papers
Adem Ait-Fonolla IN3 - UOC, Javier Luis Cánovas Izquierdo IN3 - UOC, Jordi Cabot Open University of Catalonia, Spain
Pre-print
03:07
7m
Talk
A Large-Scale Comparison of Python Code in Jupyter Notebooks and Scripts
Distinguished Paper Award
Technical Papers
Konstantin Grotov JetBrains Research, ITMO University, Sergey Titov JetBrains Research, Vladimir Sotnikov JetBrains Research, Yaroslav Golubev JetBrains Research, Timofey Bryksin JetBrains Research; HSE University
DOI Pre-print
03:14
7m
Talk
Do Customized Android Frameworks Keep Pace with Android?
Technical Papers
Pei Liu Monash University, Mattia Fazzini University of Minnesota, John Grundy Monash University, Li Li Monash University
03:21
4m
Talk
Lupa: A Platform for Large Scale Analysis of The Programming Language Usage
Data and Tool Showcase Track
Anna Vlasova JetBrains Research, Maria Tigina JetBrains Research, ITMO University, Ilya Vlasov Saint Petersburg State University, Anastasiia Birillo JetBrains Research, Yaroslav Golubev JetBrains Research, Timofey Bryksin JetBrains Research; HSE University
DOI Pre-print
03:25
4m
Talk
GitDelver Enterprise Dataset (GDED): An Industrial Closed-source Dataset for Socio-Technical Research
Data and Tool Showcase Track
Nicolas Riquet University of Namur, Xavier Devroey University of Namur, Benoît Vanderose University of Namur
Pre-print
03:29
4m
Talk
DaSEA – A Dataset for Software Ecosystem Analysis
Data and Tool Showcase Track
Petya Buchkova IT University of Copenhagen, Joakim Hey Hinnerskov IT University of Copenhagen, Kasper Olsen IT University of Copenhagen, Rolf-Helge Pfeiffer IT University of Copenhagen
Pre-print Media Attached
03:33
4m
Talk
Dataset: Dependency Networks of Open Source Libraries Available Through CocoaPods, Carthage and Swift PM
Data and Tool Showcase Track
Kristiina Rahkema University of Tartu, Dietmar Pfahl University of Tartu
Pre-print Media Attached
03:37
13m
Live Q&A
Discussions and Q&A
Technical Papers

04:00 - 04:50
Session 9: Scaling & Cloud
Industry Track / Registered Reports / Data and Tool Showcase Track / Technical Papers at MSR Main room - even hours
Chair(s): Lwin Khin Shar Singapore Management University
04:00
4m
Talk
SniP: An Efficient Stack Tracing Framework for Multi-threaded Programs
Data and Tool Showcase Track
Arun KP Indian Institute of Technology Kanpur, Saurabh Kumar Indian Institute of Technology Kanpur, Debadatta Mishra, Biswabandan Panda Indian Institute of Technology Bombay
DOI Pre-print
04:04
4m
Talk
Tooling for Time- and Space-efficient git Repository Mining
Data and Tool Showcase Track
Fabian Heseding Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Willy Scheibel Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Jürgen Döllner Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam
04:08
4m
Talk
TSSB-3M: Mining single statement bugs at massive scale
Data and Tool Showcase Track
Cedric Richter Carl von Ossietzky Universität Oldenburg / University of Oldenburg, Heike Wehrheim Carl von Ossietzky Universität Oldenburg / University of Oldenburg
Pre-print Media Attached
04:12
7m
Talk
Improved Business Outcomes from Cloud Applications – using Integrated Process and Runtime Product Data Mining
Industry Track
Mahesh Venkataraman Accenture, Reuben George Accenture, Jeff Wilkinson Accenture
04:19
7m
Talk
Improve Quality of Cloud Serverless Architectures through Software Repository Mining
Industry Track
04:26
4m
Talk
Toward Granular Automatic Unit Test Case Generation
Registered Reports
Fabiano Pecorelli Tampere University, Giovanni Grano LocalStack, Fabio Palomba University of Salerno, Harald C. Gall University of Zurich, Andrea De Lucia University of Salerno
Pre-print
04:30
20m
Live Q&A
Discussions and Q&A
Technical Papers

05:00 - 05:50
05:00
4m
Short-paper
WeakSATD: detecting weak self-admitted technical debt
Technical Papers
Barbara Russo Free University of Bolzano, Matteo Camilli Free University of Bozen-Bolzano, Moritz Mock Free University of Bolzano
DOI Pre-print Media Attached
05:04
7m
Talk
LibDB: An Effective and Efficient Framework for Detecting Third-Party Libraries in Binaries
Technical Papers
Wei Tang Tsinghua University, Yanlin Wang Microsoft Research, Hongyu Zhang University of Newcastle, Shi Han Microsoft Research, Ping Luo Tsinghua University, Dongmei Zhang Microsoft Research
Pre-print
05:11
7m
Talk
Noisy Label Learning for Security Defects
Technical Papers
Roland Croft The University of Adelaide, Muhammad Ali Babar University of Adelaide, Huaming Chen The University of Adelaide
05:18
4m
Talk
Vul4J: A Dataset of Reproducible Java Vulnerabilities Geared Towards the Study of Program Repair Techniques
Data and Tool Showcase Award
Data and Tool Showcase Track
Quang-Cuong Bui Hamburg University of Technology, Riccardo Scandariato Hamburg University of Technology, Nicolás E. Díaz Ferreyra Hamburg University of Technology
Pre-print Media Attached
05:22
4m
Talk
AndroOBFS: Time-tagged Obfuscated Android Malware Dataset with Family Information
Data and Tool Showcase Track
Saurabh Kumar Indian Institute of Technology Kanpur, Debadatta Mishra, Biswabandan Panda Indian Institute of Technology Bombay, Sandeep K. Shukla Indian Institute of Technology Kanpur
DOI Pre-print Media Attached
05:26
4m
Talk
TriggerZoo: A Dataset of Android Applications Automatically Infected with Logic Bombs
Data and Tool Showcase Track
Jordan Samhi University of Luxembourg, Tegawendé F. Bissyandé SnT, University of Luxembourg, Jacques Klein University of Luxembourg
DOI Pre-print Media Attached
05:30
4m
Talk
CamBench - Cryptographic API Misuse Detection Tool Benchmark Suite
Registered Reports
Michael Schlichtig Heinz Nixdorf Institute at Paderborn University, Anna-Katharina Wickert TU Darmstadt, Germany, Stefan Krüger Independent Researcher, Eric Bodden University of Paderborn; Fraunhofer IEM, Mira Mezini TU Darmstadt
Pre-print
05:34
16m
Live Q&A
Discussions and Q&A
Technical Papers

10:00 - 10:50
Virtual Coffee
Technical Papers at MSR Main room - even hours

This session will be for informal conversations on Midspace.

11:00 - 11:50
Session 11: Machine Learning & Information Retrieval
Technical Papers at MSR Main room - odd hours
Chair(s): Phuong T. Nguyen University of L’Aquila
11:00
4m
Short-paper
On the Naturalness of Fuzzer Generated Code
Technical Papers
Rajeswari Hita Kambhamettu Carnegie Mellon University, John Billos Wake Forest University, Carolyn "Tomi" Oluwaseun-Apo Pennsylvania State University, Benjamin Gafford Carnegie Mellon University, Rohan Padhye Carnegie Mellon University, Vincent J. Hellendoorn Carnegie Mellon University
11:04
7m
Talk
Does Configuration Encoding Matter in Learning Software Performance? An Empirical Study on Encoding Schemes
Technical Papers
Jingzhi Gong Loughborough University, Tao Chen Loughborough University
DOI Pre-print Media Attached
11:11
7m
Talk
Multimodal Recommendation of Messenger Channels
Technical Papers
Ekaterina Koshchenko JetBrains Research, Egor Klimov JetBrains Research, Vladimir Kovalenko JetBrains Research
11:18
7m
Talk
Senatus: A Fast and Accurate Code-to-Code Recommendation Engine
Technical Papers
Fran Silavong JP Morgan Chase & Co., Sean Moran JP Morgan Chase & Co., Antonios Georgiadis JP Morgan Chase & Co., Rohan Saphal JP Morgan Chase & Co., Robert Otter JP Morgan Chase & Co.
DOI Pre-print Media Attached
11:25
7m
Talk
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution: An Empirical Study
Technical Papers
Tatiana Castro Vélez City University of New York (CUNY) Graduate Center, Raffi Khatchadourian City University of New York (CUNY) Hunter College, Mehdi Bagherzadeh Oakland University, Anita Raja City University of New York (CUNY) Hunter College
Pre-print Media Attached
11:32
7m
Talk
GraphCode2Vec: Generic Code Embedding via Lexical and Program Dependence Analyses
Technical Papers
Wei Ma SnT, University of Luxembourg, Mengjie Zhao LMU Munich, Ezekiel Soremekun SnT, University of Luxembourg, Qiang Hu University of Luxembourg, Jie M. Zhang King's College London, Mike Papadakis University of Luxembourg, Luxembourg, Maxime Cordy University of Luxembourg, Luxembourg, Xiaofei Xie Singapore Management University, Singapore, Yves Le Traon University of Luxembourg, Luxembourg
Pre-print
11:39
11m
Live Q&A
Discussions and Q&A
Technical Papers

12:00 - 12:51
Vision & Reflections Track: Past
Technical Papers at MSR Plenary room
Chair(s): Bram Adams Queen's University, Kingston, Ontario, Shaowei Wang University of Manitoba
12:00
10m
Talk
Back to the future: Empirical Revolution(s) in Software Engineering
Technical Papers
Audris Mockus The University of Tennessee
12:10
10m
Talk
Engineering the MSR Field and the Joy of Research
Technical Papers
Ahmed E. Hassan Queen's University
12:20
10m
Talk
It's all in your network: How mining developer collaboration allowed us to peer into complex socio-technical aspects of software development
Technical Papers
Daniela Damian University of Victoria
12:30
21m
Other
Discussion
Technical Papers

13:00 - 13:51
Vision & Reflections Track: Future
Technical Papers at MSR Plenary room
Chair(s): Bram Adams Queen's University, Kingston, Ontario, Shaowei Wang University of Manitoba
13:00
10m
Talk
Bias in MSR research
Technical Papers
Alexander Serebrenik Eindhoven University of Technology
13:10
10m
Talk
The Next Generation of Software Developers
Technical Papers
Denae Ford Microsoft Research
13:20
10m
Talk
Mining Software Repositories in the age of AI
Technical Papers
Foutse Khomh Polytechnique Montréal
13:30
21m
Other
Discussion
Technical Papers

14:00 - 14:50
MIP Award Session
MIP Award at MSR Plenary room
Chair(s): Massimiliano Di Penta University of Sannio, Italy

Most Influential Paper: “GHTorrent: Github’s data from a firehose” by Georgios Gousios and Diomidis Spinellis (MSR 2012) for conceiving and maintaining the GHTorrent archive, extensively leveraged by the MSR community.

14:00
50m
Talk
MIP Award Talk
MIP Award
Georgios Gousios Endor Labs & Delft University of Technology, Diomidis Spinellis Athens University of Economics and Business; Delft University of Technology
20:00 - 20:50
Session 12: Integration & Large-Scale Mining
Technical Papers / Data and Tool Showcase Track at MSR Main room - even hours
Chair(s): Jin L.C. Guo McGill University, Amjed Tahir Massey University
20:00
4m
Short-paper
Is Open Source Eating the World’s Software? Measuring the Proportion of Open Source in proprietary software using Java Binaries
Technical Papers
Julius Musseau Mergebase, John Speed Meyers Chainguard, George P. Sieniawski IQT Labs, C. Albert Thompson Ford Motor Company, Daniel M. German University of Victoria
20:04
7m
Talk
Mining Code Review Data to Understand Waiting Times Between Acceptance and Merging: An Empirical Analysis
Technical Papers
Gunnar Kudrjavets University of Groningen, Aditya Kumar Snap, Inc., Nachiappan Nagappan Microsoft Research, Ayushi Rastogi University of Groningen, The Netherlands
DOI Pre-print
20:11
7m
Talk
Methods for Stabilizing Models across Large Samples of Projects (with case studies on Predicting Defect and Project Health)
Technical Papers
Suvodeep Majumder North Carolina State University, Tianpei Xia North Carolina State University, Rahul Krishna North Carolina State University, Tim Menzies North Carolina State University
Pre-print Media Attached
20:18
7m
Talk
Do Small Code Changes Merge Faster? A Multi-Language Empirical Investigation
Technical Papers
Gunnar Kudrjavets University of Groningen, Nachiappan Nagappan Microsoft Research, Ayushi Rastogi University of Groningen, The Netherlands
DOI Pre-print
20:25
7m
Talk
FaST: A linear time stack trace alignment heuristic for crash report deduplication
Technical Papers
Irving Muller Rodrigues Polytechnique Montreal, Montreal, Canada, Daniel Aloise Polytechnique Montreal, Eraldo Rezende Fernandes Leuphana University of Lüneburg
DOI Pre-print
20:32
4m
Talk
TwinDroid: A Dataset of Android app System call traces and Trace Generation Pipeline
Data and Tool Showcase Track
Asma Razgallah Université du Québec à Chicoutimi, Canada, Raphael Khoury Université du Québec à Chicoutimi, Canada, Jean-Baptiste Poulet Université du Québec à Chicoutimi, Canada
20:36
14m
Live Q&A
Discussions and Q&A
Technical Papers

21:00 - 21:50
Session 13: Security & Quality
Technical Papers / Data and Tool Showcase Track / Registered Reports / Industry Track at MSR Main room - odd hours
Chair(s): Gias Uddin University of Calgary, Canada
21:00
7m
Talk
On the Use of Fine-grained Vulnerable Code Statements for Software Vulnerability Assessment Models
Technical Papers
Triet Le Huynh Minh The University of Adelaide, Muhammad Ali Babar University of Adelaide
Pre-print
21:07
7m
Talk
LineVD: Statement-level Vulnerability Detection using Graph Neural Networks
Technical Papers
David Hin The University of Adelaide, Andrey Kan The University of Adelaide, Huaming Chen The University of Adelaide, Muhammad Ali Babar University of Adelaide
21:14
7m
Talk
LineVul: A Transformer-based Line-Level Vulnerability Prediction
Technical Papers
Michael Fu Monash University, Kla Tantithamthavorn Monash University
Pre-print
21:21
4m
Talk
ECench: An Energy Bug Benchmark of Ethereum Client Software
Data and Tool Showcase Track
Jinyoung Kim Sungkyunkwan University, Misoo Kim Sungkyunkwan University, Eunseok Lee Sungkyunkwan University
21:25
7m
Talk
Microsoft CloudMine: Data Mining for the Executive Order on Improving the Nation’s Cybersecurity
Industry Track
Kim Herzig Tools for Software Engineers, Microsoft, Luke Gostling Microsoft Corporation, Maximilian Grothusmann Microsoft Corporation, Nora Huang Microsoft Corporation, Sascha Just Microsoft, Alan Klimowski Microsoft Corporation, Yashasvini Ramkumar Microsoft Corporation, Myles McLeroy Microsoft Corporation, Kıvanç Muşlu Microsoft, Hitesh Sajnani Microsoft, Varsha Vadaga Microsoft Corporation
21:32
4m
Talk
Evaluating few shot and Contrastive learning Methods for Code Clone Detection
Registered Reports
Mohamad Khajezade University of British Columbia, Fatemeh Hendijani Fard University of British Columbia, Mohamed S Shehata University of British Columbia
Pre-print
21:36
14m
Live Q&A
Discussions and Q&A
Technical Papers

22:00 - 22:50
Foundational Contribution Award Session Technical Papers at MSR Plenary room
Chair(s): Miryung Kim University of California at Los Angeles, USA
22:00
50m
Awards
MSR Foundational Contribution Award
Technical Papers
Dongmei Zhang Microsoft Research, Tao Xie Peking University

Fri 20 May

Displayed time zone: Eastern Time (US & Canada)

04:00 - 04:50
Session 14: Software Quality Technical Papers / Industry Track / Data and Tool Showcase Track at MSR Main room - even hours
Chair(s): Kla Tantithamthavorn Monash University, Simone Scalabrino University of Molise
04:00
4m
Short-paper
Evaluating the effectiveness of local explanation methods on source code-based defect prediction models
Technical Papers
Yuxiang Gao Jiangsu Normal University, Yi Zhu Jiangsu Normal University, Qiao YU Jiangsu Normal University
Pre-print
04:04
7m
Talk
Problems and Solutions in Applying Continuous Integration and Delivery to 20 Open-Source Cyber-Physical Systems
Technical Papers
Fiorella Zampetti University of Sannio, Italy, Vittoria Nardone University of Sannio, Massimiliano Di Penta University of Sannio, Italy
04:11
7m
Talk
To Type or Not to Type? A Systematic Comparison of the Software Quality of JavaScript and TypeScript Applications on GitHub
Technical Papers
Justus Bogner University of Stuttgart, Institute of Software Engineering, Empirical Software Engineering Group, Manuel Merkel University of Stuttgart
Pre-print
04:18
7m
Talk
Using Bandit Algorithms for Selecting Feature Reduction Techniques in Software Defect Prediction
Technical Papers
Masateru Tsunoda Kindai University, Akito Monden Okayama University, Koji Toda Fukuoka Institute of Technology, Amjed Tahir Massey University, Kwabena Ebo Bennin Wageningen University and Research, Keitaro Nakasai National Institute of Technology, Kagoshima College, Masataka Nagura Nanzan University, Kenichi Matsumoto Nara Institute of Science and Technology
Pre-print
04:25
4m
Talk
Constructing Dataset of Functionally Equivalent Java Methods Using Automated Test Generation Techniques
Data and Tool Showcase Track
Yoshiki Higo Osaka University, Shinsuke Matsumoto Osaka University, Shinji Kusumoto Osaka University, Kazuya Yasuda Hitachi, Ltd.
Media Attached
04:29
7m
Talk
Extracting corrective actions from code repositories
Industry Track
Yegor Bugayenko Huawei, Kirill Daniakin Innopolis University, Mirko Farina Innopolis University, Firas Jolha Innopolis University, Artem Kruglov Innopolis University, Witold Pedrycz University of Alberta, Giancarlo Succi Innopolis University
04:36
14m
Live Q&A
Discussions and Q&A
Technical Papers

05:00 - 05:30
Closing Session of Virtual MSR 2022 + Introduction of MSR 2023 Technical Papers at MSR Plenary room

Speakers: David Lo, Shane McIntosh, Nicole Novielli, Emad Shihab

10:00 - 10:50
Shadow PC Retrospective Technical Papers at MSR Main room - even hours
Chair(s): Eleni Constantinou Eindhoven University of Technology, Sarah Nadi University of Alberta

Closed to Shadow PC Members.

11:00 - 11:50
Session 15: Collaboration & Open Source Registered Reports / Data and Tool Showcase Track / Technical Papers / Industry Track at MSR Main room - odd hours
Chair(s): Massimiliano Di Penta University of Sannio, Italy, Fiorella Zampetti University of Sannio, Italy
11:00
7m
Talk
Code Review Practices for Refactoring Changes: An Empirical Study on OpenStack
Technical Papers
Eman Abdullah AlOmar Stevens Institute of Technology, Moataz Chouchen ETS, Mohamed Wiem Mkaouer Rochester Institute of Technology, Ali Ouni ETS Montreal, University of Quebec
Pre-print
11:07
4m
Talk
FixJS: A Dataset of Bug-fixing JavaScript Commits
Data and Tool Showcase Track
Viktor Csuvik Department of Software Engineering, MTA-SZTE Research Group on Artificial Intelligence, University of Szeged, Szeged, Hungary, László Vidács University of Szeged, Hungary
File Attached
11:11
4m
Talk
A Time Series-Based Dataset of Open-Source Software Evolution
Data and Tool Showcase Track
Bruno L. Sousa UFMG, Mariza Bigonha Professor at Federal University of Minas Gerais, Kecia A. M. Ferreira CEFET-MG, Glaura C. Franco UFMG
DOI Pre-print Media Attached
11:15
4m
Talk
LAGOON: An Analysis Tool for Open Source Communities
Data and Tool Showcase Track
Sourya Dey Galois, Inc., Walt Woods Galois, Inc.
Pre-print Media Attached
11:19
4m
Talk
A Versatile Dataset of Agile Open Source Software Projects
Data and Tool Showcase Track
Vali Tawosi University College London, Afnan Al-Subaihin University College London, Rebecca Moussa University College London, Federica Sarro University College London
Link to publication DOI Pre-print Media Attached
11:23
7m
Talk
Automatically Prioritizing and Assigning Tasks from Code Repositories in Puzzle Driven Development
Industry Track
Ayomide Bakare Innopolis University, Yegor Bugayenko Huawei, Arina Cheverda Innopolis University, Mirko Farina Innopolis University, Artem Kruglov Innopolis University, Witold Pedrycz University of Alberta, Giancarlo Succi Innopolis University
11:30
4m
Talk
Towards Understanding Barriers and Mitigation Strategies of Software Engineers with Non-traditional Educational and Occupational Backgrounds
Registered Reports
Tavian Barnes University of Waterloo, Ken Jen Lee University of Waterloo, Cristina Tavares University of Waterloo, Gema Rodríguez-Pérez University of British Columbia (UBC), Mei Nagappan University of Waterloo
Pre-print
11:34
4m
Talk
Can instability variations warn developers when open-source projects boost?
Registered Reports
Alejandro Valezate Rey Juan Carlos University, Rafael Capilla Universidad Rey Juan Carlos, Gregorio Robles Universidad Rey Juan Carlos, Victor Salamanca Rey Juan Carlos University
Pre-print
11:38
12m
Live Q&A
Discussions and Q&A
Technical Papers

12:00 - 12:50
Tutorial: Using Datalore for Reproducible Research Tutorials at MSR Main room - odd hours
12:00
50m
Tutorial
Using Datalore for Reproducible Research
Tutorials
Jodie Burchell JetBrains
13:00 - 13:50
Hackathon Hackathon / Technical Papers at MSR Main room - odd hours
Chair(s): Gregorio Robles Universidad Rey Juan Carlos, Jesus M. Gonzalez-Barahona Universidad Rey Juan Carlos, Maëlick Claes University of Oulu
13:00
5m
Talk
Bot Detection in GitHub Repositories
Hackathon
Natarajan Chidambaram University of Mons, Pooya Rostami Mazrae University of Mons
DOI Pre-print
13:05
5m
Talk
GitRank: A Framework to Rank GitHub Repositories
Hackathon
Pre-print Media Attached
13:10
5m
Talk
GrimoireLab Maintenance and Evolution
Hackathon
Willem Meijer University of Groningen, David Visscher University of Groningen, Erwin de Haan University of Groningen, Merijn Schröder University of Groningen, Leon Visscher University of Groningen, Andrea Capiluppi University of Groningen, Ioan Botez University of Groningen
Link to publication DOI Pre-print Media Attached
13:15
5m
Talk
OpenSSL 3.0.0: An exploratory case study
Hackathon
James Walden Northern Kentucky University
Pre-print
13:20
5m
Talk
Quid Pro Quo: An Exploration of Reciprocity in Code Review
Hackathon
Carlos Gavidia-Calderon The Open University, UK, DongGyun Han Singapore Management University, Amel Bennaceur The Open University
Pre-print Media Attached
13:25
5m
Talk
Replicating Data Pipelines with GrimoireLab
Hackathon
Kalvin Eng University of Alberta, Hareem Sahar University of Alberta
Pre-print
13:30
20m
Live Q&A
Discussions and Q&A
Technical Papers

13:00 - 13:50
Tutorial: Software Bots in Software Engineering: Benefits and Challenges Tutorials at MSR Tutorials room
13:00
50m
Tutorial
Software Bots in Software Engineering: Benefits and Challenges
Tutorials
Mairieli Wessel Delft University of Technology, Marco Gerosa Northern Arizona University, USA, Emad Shihab Concordia University
14:00 - 15:00
Session 16: Non-functional Properties (Availability, Security, Legal Aspects) Industry Track / Technical Papers / Registered Reports / Data and Tool Showcase Track at MSR Main room - even hours
Chair(s): Maxime Lamothe Polytechnique Montreal, Montreal, Canada, Jin L.C. Guo McGill University
14:00
7m
Talk
A Deep Study of the Effects and Fixes of Server-Side Request Races in Web Applications
Technical Papers
Zhengyi Qiu North Carolina State University, Shudi Shao North Carolina State University, Qi Zhao North Carolina State University, Hassan Ali Khan North Carolina State University, Xinning Hui North Carolina State University, Guoliang Jin North Carolina State University
Media Attached
14:07
4m
Talk
A Large-scale Dataset of (Open Source) License Text Variants (Data and Tool Showcase Award)
Data and Tool Showcase Track
Stefano Zacchiroli Télécom Paris, Polytechnic Institute of Paris
DOI Pre-print
14:11
7m
Talk
SECOM: Towards a convention for security commit messages (FOSS Impact Paper Award)
Industry Track
Sofia Reis Instituto Superior Técnico, U. Lisboa & INESC-ID, Rui Abreu Faculty of Engineering, University of Porto, Portugal, Hakan Erdogmus Carnegie Mellon University, Corina S. Păsăreanu Carnegie Mellon University
Pre-print
14:18
7m
Talk
Varangian: A Git Bot for Augmented Static Analysis
Industry Track
Saurabh Pujar IBM Research, Yunhui Zheng IBM Research, Luca Buratti IBM Research, Burn Lewis IBM Research, Alessandro Morari IBM Research, Jim A. Laredo IBM Research, Kevin Postlethwait Red Hat, Christoph Görn Red Hat
14:25
7m
Talk
Detecting Privacy-Sensitive Code Changes with Language Modeling
Industry Track
Gökalp Demirci Meta Platforms, Inc., Vijayaraghavan Murali Meta Platforms, Inc., Imad Ahmad Meta Platforms, Inc., Rajeev Rao Meta Platforms, Inc., Gareth Ari Aye Meta Platforms, Inc.
14:32
4m
Talk
Is GitHub's Copilot as Bad As Humans at Introducing Vulnerabilities in Code?
Registered Reports
Owura Asare University of Waterloo, Mei Nagappan University of Waterloo, N. Asokan University of Waterloo
Pre-print
14:36
7m
Talk
Finding the Fun in Fundraising: Public Issues and Pull Requests in VC-backed Open-Core Companies
Industry Track
Kevin Xu GitHub
14:43
17m
Live Q&A
Discussions and Q&A
Technical Papers

Mon 23 May

Displayed time zone: Eastern Time (US & Canada)

09:00 - 10:30
In-Person MSR Opening, Keynote and MIP Session Technical Papers / MIP Award at Room 315+316
Chair(s): David Lo Singapore Management University
09:00
20m
Talk
In-Person MSR 2022 Opening Session
Technical Papers
David Lo Singapore Management University, Shane McIntosh University of Waterloo, Nicole Novielli University of Bari
09:20
35m
Keynote
From Models to Systems: Rethinking the Role of Software Engineering for Machine Learning
Technical Papers
Christian Kästner Carnegie Mellon University
09:55
35m
Talk
MIP Award Talk
MIP Award
Georgios Gousios Endor Labs & Delft University of Technology, Diomidis Spinellis Athens University of Economics and Business; Delft University of Technology
11:00 - 12:30
Blended Technical Session 1 (Integration, Large-scale mining, and Software Ecosystems) Technical Papers / Data and Tool Showcase Track at Room 315+316
Chair(s): Bogdan Vasilescu Carnegie Mellon University, USA
11:00
15m
Talk
Do Small Code Changes Merge Faster? A Multi-Language Empirical Investigation
Technical Papers
Gunnar Kudrjavets University of Groningen, Nachiappan Nagappan Microsoft Research, Ayushi Rastogi University of Groningen, The Netherlands
DOI Pre-print
11:15
15m
Talk
Mining Code Review Data to Understand Waiting Times Between Acceptance and Merging: An Empirical Analysis
Technical Papers
Gunnar Kudrjavets University of Groningen, Aditya Kumar Snap, Inc., Nachiappan Nagappan Microsoft Research, Ayushi Rastogi University of Groningen, The Netherlands
DOI Pre-print
11:30
8m
Talk
Dataset: Dependency Networks of Open Source Libraries Available Through CocoaPods, Carthage and Swift PM
Data and Tool Showcase Track
Kristiina Rahkema University of Tartu, Dietmar Pfahl University of Tartu
Pre-print Media Attached
11:38
8m
Talk
A Large-scale Dataset of (Open Source) License Text Variants (Data and Tool Showcase Award)
Data and Tool Showcase Track
Stefano Zacchiroli Télécom Paris, Polytechnic Institute of Paris
DOI Pre-print
11:46
8m
Talk
TSSB-3M: Mining single statement bugs at massive scale
Data and Tool Showcase Track
Cedric Richter Carl von Ossietzky Universität Oldenburg / University of Oldenburg, Heike Wehrheim Carl von Ossietzky Universität Oldenburg / University of Oldenburg
Pre-print Media Attached
11:54
8m
Talk
LAGOON: An Analysis Tool for Open Source Communities
Data and Tool Showcase Track
Sourya Dey Galois, Inc., Walt Woods Galois, Inc.
Pre-print Media Attached
12:02
8m
Talk
The Unexplored Treasure Trove of Phabricator Code Reviews
Data and Tool Showcase Track
Gunnar Kudrjavets University of Groningen, Nachiappan Nagappan Microsoft Research, Ayushi Rastogi University of Groningen, The Netherlands
DOI Pre-print
12:10
20m
Live Q&A
Discussions and Q&A
Technical Papers

13:30 - 15:00
Blended Technical Session 2 (Machine Learning and Information Retrieval) Technical Papers / Data and Tool Showcase Track at Room 315+316
Chair(s): Preetha Chatterjee Drexel University, USA
13:30
15m
Talk
Methods for Stabilizing Models across Large Samples of Projects (with case studies on Predicting Defect and Project Health)
Technical Papers
Suvodeep Majumder North Carolina State University, Tianpei Xia North Carolina State University, Rahul Krishna North Carolina State University, Tim Menzies North Carolina State University
Pre-print Media Attached
13:45
15m
Talk
GraphCode2Vec: Generic Code Embedding via Lexical and Program Dependence Analyses
Technical Papers
Wei Ma SnT, University of Luxembourg, Mengjie Zhao LMU Munich, Ezekiel Soremekun SnT, University of Luxembourg, Qiang Hu University of Luxembourg, Jie M. Zhang King's College London, Mike Papadakis University of Luxembourg, Luxembourg, Maxime Cordy University of Luxembourg, Luxembourg, Xiaofei Xie Singapore Management University, Singapore, Yves Le Traon University of Luxembourg, Luxembourg
Pre-print
14:00
15m
Talk
Senatus: A Fast and Accurate Code-to-Code Recommendation Engine
Technical Papers
Fran Silavong JP Morgan Chase & Co., Sean Moran JP Morgan Chase & Co., Antonios Georgiadis JP Morgan Chase & Co., Rohan Saphal JP Morgan Chase & Co., Robert Otter JP Morgan Chase & Co.
DOI Pre-print Media Attached
14:15
8m
Short-paper
Comments on Comments: Where Code Review and Documentation Meet
Technical Papers
Nikitha Rao Carnegie Mellon University, Jason Tsay IBM Research, Martin Hirzel IBM Research, Vincent J. Hellendoorn Carnegie Mellon University
DOI Pre-print File Attached
14:23
8m
Short-paper
On the Naturalness of Fuzzer Generated Code
Technical Papers
Rajeswari Hita Kambhamettu Carnegie Mellon University, John Billos Wake Forest University, Carolyn "Tomi" Oluwaseun-Apo Pennsylvania State University, Benjamin Gafford Carnegie Mellon University, Rohan Padhye Carnegie Mellon University, Vincent J. Hellendoorn Carnegie Mellon University
14:31
8m
Talk
SOSum: A Dataset of Stack Overflow Post Summaries
Data and Tool Showcase Track
Bonan Kou Purdue University, Yifeng Di Purdue University, Muhao Chen University of Southern California, Tianyi Zhang Purdue University
14:39
21m
Live Q&A
Discussions and Q&A
Technical Papers

15:30 - 17:00
Networking & Poster Session Technical Papers at Room 315+316
Chair(s): Miikka Kuutila University of Oulu

The following are the registered posters:

  • Kristiina Rahkema, Dietmar Pfahl – Dataset: Dependency Networks of Open Source Libraries Available Through CocoaPods, Carthage and Swift PM
  • Ruben Opdebeeck, Ahmed Zerouali, Coen De Roover – Smelly Variables in Ansible Infrastructure Code: Detection, Prevalence, and Lifetime
  • Clara Marie Lüders, Abir Bouraffa, Walid Maalej – Beyond Duplicates: Towards Understanding and Predicting Link Types in Issue Tracking Systems
  • Johannes Härtel, Ralf Laemmel – Operationalizing Threats to MSR Studies by Simulation-Based Testing
  • Michael Schlichtig, Anna-Katharina Wickert, Stefan Krüger, Eric Bodden, Mira Mezini – CamBench - Cryptographic API Misuse Detection Tool Benchmark Suite
  • Sourya Dey, Walt Woods – LAGOON: An Analysis Tool for Open Source Communities
  • Cedric Richter, Heike Wehrheim – TSSB-3M: Mining single statement bugs at massive scale
  • Tatiana Castro Velez, Raffi Khatchadourian, Mehdi Bagherzadeh, Anita Raja – Challenges in Migrating Imperative Deep Learning Programs to Graph Execution: An Empirical Study
  • Rahul Yedida, Tim Menzies – How to Improve Deep Learning for Software Analytics (a case study with code smell detection)
  • Suvodeep Majumder, Tianpei Xia, Rahul Krishna, Tim Menzies – Methods for Stabilizing Models across Large Samples of Projects (with case studies on Predicting Defect and Project Health)
  • Anirudh Ramchandran, Likang Yin, Vladimir Filkov – Exploring Apache Incubator Project Trajectories with APEX
  • Nikitha Rao, Jason Tsay, Martin Hirzel, Vincent J. Hellendoorn – Comments on Comments: Where Code Review and Documentation Meet
  • Kimberly Truong, Courtney Miller, Bogdan Vasilescu, Christian Kästner – The Unsolvable Problem or the Unheard Answer? A Dataset of 24,669 Open-Source Software Conference Talks
  • Anthony Peruma, Eman Abdullah AlOmar, Christian D. Newman, Mohamed Wiem Mkaouer, Ali Ouni – Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship Between Technical Debt and Refactoring

Tue 24 May

Displayed time zone: Eastern Time (US & Canada)

09:00 - 10:30
Blended Technical Session 3 (Smells and Maintenance) Technical Papers / Mining Challenge / Registered Reports / Data and Tool Showcase Track at Room 315+316
Chair(s): Andy Zaidman Delft University of Technology
09:00
15m
Talk
Smelly Variables in Ansible Infrastructure Code: Detection, Prevalence, and Lifetime
Technical Papers
Ruben Opdebeeck Vrije Universiteit Brussel, Ahmed Zerouali Vrije Universiteit Brussel, Coen De Roover Vrije Universiteit Brussel
Pre-print
09:15
15m
Talk
Beyond Duplicates: Towards Understanding and Predicting Link Types in Issue Tracking Systems
Technical Papers
Clara Marie Lüders University of Hamburg, Abir Bouraffa University of Hamburg, Walid Maalej University of Hamburg
DOI Pre-print
09:30
15m
Talk
How to Improve Deep Learning for Software Analytics (a case study with code smell detection)
Technical Papers
Rahul Yedida, Tim Menzies North Carolina State University
Pre-print
09:45
8m
Talk
npm-filter: Automating the mining of dynamic information from npm packages
Data and Tool Showcase Track
Ellen Arteca Northeastern University, Alexi Turcotte Northeastern University
Pre-print Media Attached
09:53
8m
Talk
Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship Between Technical Debt and Refactoring (Best Mining Challenge Paper Award)
Mining Challenge
Anthony Peruma Rochester Institute of Technology, Eman Abdullah AlOmar Stevens Institute of Technology, Christian D. Newman Rochester Institute of Technology, Mohamed Wiem Mkaouer Rochester Institute of Technology, Ali Ouni ETS Montreal, University of Quebec
Pre-print Media Attached
10:01
8m
Talk
CamBench - Cryptographic API Misuse Detection Tool Benchmark Suite
Registered Reports
Michael Schlichtig Heinz Nixdorf Institute at Paderborn University, Anna-Katharina Wickert TU Darmstadt, Germany, Stefan Krüger Independent Researcher, Eric Bodden University of Paderborn; Fraunhofer IEM, Mira Mezini TU Darmstadt
Pre-print
10:09
21m
Live Q&A
Discussions and Q&A
Technical Papers

11:00 - 12:15
Blended Technical Session 4 (Introspection, Vision, and Human Aspects) Technical Papers / Registered Reports / Data and Tool Showcase Track at Room 315+316
Chair(s): Ayushi Rastogi University of Groningen, The Netherlands
11:00
15m
Talk
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution: An Empirical Study
Technical Papers
Tatiana Castro Vélez City University of New York (CUNY) Graduate Center, Raffi Khatchadourian City University of New York (CUNY) Hunter College, Mehdi Bagherzadeh Oakland University, Anita Raja City University of New York (CUNY) Hunter College
Pre-print Media Attached
11:15
15m
Talk
Operationalizing Threats to MSR Studies by Simulation-Based Testing (Distinguished Paper Award)
Technical Papers
Johannes Härtel University of Koblenz-Landau, Germany, Ralf Laemmel Facebook London
Pre-print Media Attached
11:30
8m
Short-paper
Geographic Diversity in Public Code Contributions
Technical Papers
Davide Rossi University of Bologna, Stefano Zacchiroli Télécom Paris, Polytechnic Institute of Paris
Pre-print Media Attached
11:38
8m
Talk
The General Index of Software Engineering Papers
Data and Tool Showcase Track
Zeinab Abou Khalil Inria, Stefano Zacchiroli Télécom Paris, Polytechnic Institute of Paris
DOI Pre-print
11:46
8m
Talk
Investigating the Impact of Forgetting in Software Development
Registered Reports
Utku Unal METU, Eray Tüzün Bilkent University, Tamer Gezici Bilkent University, Ausaf Ahmed Farooqui Bilkent University
Pre-print
11:54
21m
Live Q&A
Discussions and Q&A
Technical Papers

12:15 - 12:30
Brainstorming / Discussion I Technical Papers at Room 315+316
Chair(s): Shane McIntosh University of Waterloo
13:30 - 15:00
Brainstorming / Discussion II Technical Papers at Room 315+316
Chair(s): Shane McIntosh University of Waterloo
15:30 - 17:00
Blended Technical Session 5 (Miscellaneous) Technical Papers / Data and Tool Showcase Track / Mining Challenge at Room 315+316
Chair(s): Luís Cruz Delft University of Technology
15:30
15m
Talk
Code Review Practices for Refactoring Changes: An Empirical Study on OpenStack
Technical Papers
Eman Abdullah AlOmar Stevens Institute of Technology, Moataz Chouchen ETS, Mohamed Wiem Mkaouer Rochester Institute of Technology, Ali Ouni ETS Montreal, University of Quebec
Pre-print
15:45
15m
Talk
Painting the Landscape of Automotive Software in GitHub
Technical Papers
Sangeeth Kochanthara Eindhoven University of Technology, Yanja Dajsuren Eindhoven University of Technology, Loek Cleophas Eindhoven University of Technology (TU/e) and Stellenbosch University (SU), Mark van den Brand Eindhoven University of Technology
Pre-print Media Attached
16:00
8m
Talk
SLNET: A Redistributable Corpus of 3rd-party Simulink Models
Data and Tool Showcase Track
Sohil Lal Shrestha The University of Texas at Arlington, Shafiul Azam Chowdhury University of Texas at Arlington, Christoph Csallner University of Texas at Arlington
DOI Pre-print Media Attached
16:08
8m
Talk
SoCCMiner: A Source Code-Comments and Comment-Context Miner
Data and Tool Showcase Track
Murali Sridharan University of Oulu, Mika Mäntylä University of Oulu, Maëlick Claes University of Oulu, Leevi Rantala University of Oulu
Pre-print
16:16
8m
Talk
An Exploratory Study on Refactoring Documentation in Issues Handling
Mining Challenge
Eman Abdullah AlOmar Stevens Institute of Technology, Anthony Peruma Rochester Institute of Technology, Mohamed Wiem Mkaouer Rochester Institute of Technology, Christian D. Newman Rochester Institute of Technology, Ali Ouni ETS Montreal, University of Quebec
Pre-print
16:24
8m
Talk
Between JIRA and GitHub: ASFBot and its Influence on Human Comments in Issue Trackers
Mining Challenge
Ambarish Moharil Eindhoven University of Technology, Dmitrii Orlov Eindhoven University of Technology, Samar Jameel Eindhoven University of Technology, Tristan Trouwen Eindhoven University of Technology, Nathan Cassee Eindhoven University of Technology, Alexander Serebrenik Eindhoven University of Technology
Pre-print
16:32
28m
Live Q&A
Discussions and Q&A
Technical Papers

17:00 - 17:30
MSR Award and Closing Session Technical Papers at Room 315+316

Accepted Papers

Title
CamBench - Cryptographic API Misuse Detection Tool Benchmark Suite
Registered Reports
Pre-print
Can instability variations warn developers when open-source projects boost?
Registered Reports
Pre-print
Evaluating few shot and Contrastive learning Methods for Code Clone Detection
Registered Reports
Pre-print
Investigating the Impact of Forgetting in Software Development
Registered Reports
Pre-print
Is GitHub's Copilot as Bad As Humans at Introducing Vulnerabilities in Code?
Registered Reports
Pre-print
Is Surprisal in Issue Trackers Actionable?
Registered Reports
DOI Pre-print Media Attached
Recommending Code Improvements Based on Stack Overflow Answer Edits
Registered Reports
Pre-print
Toward Granular Automatic Unit Test Case Generation
Registered Reports
Pre-print
Towards Understanding Barriers and Mitigation Strategies of Software Engineers with Non-traditional Educational and Occupational Backgrounds
Registered Reports
Pre-print
Towards Using Gameplay Videos for Detecting Issues in Video Games
Registered Reports
Pre-print

Call for Registrations

Empirical Software Engineering Journal (EMSE), in conjunction with the conference on Mining Software Repositories (MSR), is continuing the Registered Reports (RR) track. The RR track of MSR 2022 has two goals: (1) to prevent HARKing (hypothesizing after the results are known) for empirical studies; (2) to provide early feedback to authors in their initial study design. For papers submitted to the RR track, methods and proposed analyses are reviewed prior to execution. Pre-registered studies follow a two-step process:

  • Stage 1: A report is submitted that describes the planned study. The submitted report is evaluated by the reviewers of the RR track of MSR 2022. Authors of accepted pre-registered studies will be given the opportunity to present their work at MSR.
  • Stage 2: Once a report has passed Stage 1, the study will be conducted and actual data collection and analysis take place. The results may also be negative! The full paper is submitted for review to EMSE.

Paper Types, Evaluation Criteria, and Acceptance Types

The RR track of MSR 2022 supports two types of papers:

Confirmatory: The researcher has a fixed hypothesis (or several fixed hypotheses) and the objective of the study is to find out whether the hypothesis is supported by the facts/data.

An example of a completed confirmatory study:

  • Inozemtseva, L., & Holmes, R. (2014, May). Coverage is not strongly correlated with test suite effectiveness. In Proceedings of the 36th international conference on software engineering (pp. 435-445).

Exploratory: The researcher does not have a hypothesis (or has one that may change during the study). Often, the objective of such a study is to understand what is observed and answer questions such as WHY, HOW, WHAT, WHO, or WHEN. We include in this category registrations for which the researcher has an initial proposed solution for an automated approach (e.g., a new deep-learning-based defect prediction approach) that serves as a starting point for his/her exploration to reach an effective solution.

Examples of completed exploratory studies:

  • Gousios, G., Pinzger, M., & Deursen, A. V. (2014, May). An exploratory study of the pull-based software development model. In Proceedings of the 36th International Conference on Software Engineering (pp. 345-355).
  • Rodrigues, I. M., Aloise, D., Fernandes, E. R., & Dagenais, M. (2020, June). A Soft Alignment Model for Bug Deduplication. In Proceedings of the 17th International Conference on Mining Software Repositories (pp. 43-53).

The reviewers will evaluate RR track submissions based on the following criteria:

  • The importance of the research question(s).
  • The logic, rationale, and plausibility of the proposed hypotheses.
  • The soundness and feasibility of the methodology and analysis pipeline (including statistical power analysis where appropriate).
  • (For confirmatory study) Whether the clarity and degree of methodological detail are sufficient to exactly replicate the proposed experimental procedures and analysis pipeline.
  • (For confirmatory study) Whether the authors have pre-specified sufficient outcome-neutral tests for ensuring that the results obtained can test the stated hypotheses, including positive controls and quality checks.
  • (For exploratory study, if applicable) The description of the data set that is the base for exploration.

The outcome of the RR report review is one of the following:

  • In-Principle Acceptance (IPA): The reviewers agree that the study is relevant, the outcome of the study (whether confirmation / rejection of hypothesis) is of interest to the community, the protocol for data collection is sound, and that the analysis methods are adequate. The authors can engage in the actual study for Stage 2. If the protocol is adhered to (or deviations are thoroughly justified), the study is published. Of course, this being a journal submission, a revision of the submitted manuscript may be necessary. Reviewers will especially evaluate how precisely the protocol of the accepted pre-registered report is followed, or whether deviations are justified.
  • Continuity Acceptance (CA): The reviewers agree that the study is relevant and that the (initial) methods appear to be appropriate. However, for exploratory studies, implementation details and post-experiment analyses or discussion (e.g., why the proposed automated approach does not work) may require follow-up checks. We’ll try our best to get the original reviewers. All PC members will be invited on the condition that they agree to review papers in both Stage 1 and Stage 2. Four (4) PC members will review the Stage 1 submission, and three (3) will review the Stage 2 submission.
  • Rejection: The reviewers do not agree on the relevance of the study or are not convinced that the study design is sufficiently mature. Comments are provided to the authors to improve the study design before starting it.

Note: For MSR 2022, a confirmatory study is granted only an IPA. An exploratory study in software engineering often cannot be adequately assessed until after the study has been completed and the findings are elaborated and discussed in a full paper. For example, consider a study in an RR proposing defect prediction using a new deep learning architecture. This work falls under the exploratory category. It is difficult to offer IPA, as we do not know whether it is any better than a traditional approach based on, e.g., decision trees. Negative results are welcome; however, it is important that a negative results paper goes beyond presenting “we tried and failed” and provides interesting insights to readers, e.g., why the results are negative or what that means for further studies on this topic (following criteria of REplication and Negative Results (RENE) tracks, e.g., https://saner2019.github.io/cfp/RENETrack.html). Furthermore, it is important to note that authors are required to document all deviations (if any) in a section of the paper.

Submission Process and Instructions

The timeline for MSR 2022 RR track will be as follows:

Feb 4: Authors submit their initial report.

  • Submissions must not exceed 6 pages (plus 1 additional page of references). The page limit is strict. Please register by Feb 2nd; you can then edit your submission up until Feb 4th AoE.
  • All authors should use the official “ACM Primary Article Template”, as can be obtained from the ACM Proceedings Template page. LaTeX users should use the sigconf option, as well as the review option (to produce line numbers for easy reference by the reviewers). To that end, the following LaTeX code can be placed at the start of the LaTeX document:

\documentclass[sigconf,review]{acmart}
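
As a rough illustration, the class line above could sit at the top of a report skeleton like the one below. This is only a sketch: the section names mirror the Author's Guide further down this page, while the title, author details, and the references.bib file name are placeholders assumed for the example, not requirements of the track.

```latex
% Hypothetical skeleton; all names and text below are placeholders.
\documentclass[sigconf,review]{acmart}

\begin{document}

\title{Working Title of the Study}
\subtitle{Optional Subtitle}

\author{First Author}
\affiliation{%
  \institution{Example University}
  \country{Example Country}}
\email{first.author@example.org}

% In acmart, the abstract must appear before \maketitle.
\begin{abstract}
\textbf{Background/Context:} ...
\textbf{Objective/Aim:} ...
\textbf{Method:} ...
\end{abstract}

\maketitle

\section{Introduction}
\section{Hypotheses or Research Questions}
\section{Variables}
\section{Participants/Subjects/Datasets}
\section{Execution Plan}

\bibliographystyle{ACM-Reference-Format}
\bibliography{references} % assumes a references.bib file next to this one

\end{document}
```

The review option only adds line numbers to the margins, so it can stay enabled for the whole Stage 1 process.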

Mar 4: Authors receive PC members’ reviews.

Mar 18: Authors submit a response letter + revised report in a single PDF.

  • The response letter should address reviewer comments and questions.
  • The response letter + revised report must not exceed 12 pages (plus 1 additional page of references).
  • The response letter does not need to follow ACM formatting instructions.

April 8: Notification of Stage 1

  • (Outcome: in-principle acceptance, continuity acceptance, or rejection).

April 15: Authors submit their accepted RR report to arXiv

  • To be checked by PC members for Stage 2
  • Note: Due to the timeline, RR reports will not be published in the MSR 2022 proceedings.

Before Feb 3, 2023: Authors submit a full paper to EMSE. Instructions will be provided later. However, the following constraints will be enforced:

  • Justification must be given for any change of authors. If authors are added or removed, or the author order is changed, between the original Stage 1 submission and the EMSE submission, all authors will need to complete and sign a “Change of authorship request form”. The Editors in Chief of EMSE and the chairs of the RR track reserve the right to deny author changes. If you anticipate any authorship changes, please reach out to the chairs of the RR track as early as possible.
  • PC members who reviewed an RR report in Stage 1 and their directly supervised students cannot be added as authors of the corresponding submission in Stage 2.

Submissions can be made via the submission site (https://msr2022-registered-report.hotcrp.com/) by the submission deadline. Any submission that does not comply with the aforementioned instructions and the mandatory information specified in the Author Guide is likely to be desk rejected. In addition, by submitting, the authors acknowledge that they are aware of and agree to be bound by the following policies:

  • The ACM Policy and Procedures on Plagiarism and the IEEE Plagiarism FAQ. In particular, papers submitted to MSR 2022 must not have been published elsewhere and must not be under review or submitted for review elsewhere whilst under consideration for MSR 2022. Contravention of this concurrent submission policy will be deemed a serious breach of scientific ethics, and appropriate action will be taken in all such cases (including immediate rejection and reporting of the incident to ACM/IEEE). To check for double submission and plagiarism issues, the chairs reserve the right to (1) share the list of submissions with the PC Chairs of other conferences with overlapping review periods and (2) use external plagiarism detection software, under contract to the ACM or IEEE, to detect violations of these policies.

  • The authorship policy of the ACM and the authorship policy of the IEEE.

Author's Guide

NB: Please contact the MSR RR track chairs with any questions, feedback, or requests for clarification. Specific analysis approaches mentioned below are intended as examples, not mandatory components.

I. Title (required)

Provide the working title of your study. It may be the same title that you submit for publication of your final manuscript, but this is not mandatory.

Example: Should your family travel with you on the Enterprise? Subtitle (optional): Effect of accompanying families on the work habits of crew members

II. Authors (required)

At this stage, we believe that a single-blind review is most productive.

III. Structured Abstract (required)

The abstract should describe the following in 200 words or so:

  • Background/Context

    What is your research about? Why are you doing this research, why is it interesting?

    Example: “The Enterprise is the flagship of the Federation, and it allows families to travel onboard. However, there are no studies that evaluate how this affects the crew members.”

  • Objective/Aim

    What exactly are you studying/investigating/evaluating? What are the objects of the study? We welcome both confirmatory and exploratory types of studies.

    Example (Confirmatory): We evaluate whether the frequency of sick days, the work effectiveness and efficiency differ between science officers who bring their family with them, compared to science officers who are serving without their family.

    Example (Exploratory): We investigate the problem of frequent Holodeck use on interpersonal relationships with an ethnographic study using participant observation, in order to derive specific hypotheses about Holodeck usage.

  • Method

    How are you addressing your objective? What data sources are you using?

    Example: We conduct an observational study and use a between-subjects design. To analyze the data, we use a t-test or Wilcoxon test, depending on the underlying distribution. Our data comes from computer monitoring of Enterprise crew members.

IV. Introduction

Give more details on the bigger picture of your study and how it contributes to this bigger picture. An important component of phase 1 review is assessing the importance and relevance of the study questions, so be sure to explain this.

V. Hypotheses (required for confirmatory study) or research questions

Clearly state the research hypotheses that you want to test with your study, and give a rationale for each hypothesis.

Hypothesis: Science officers with their family on board have more sick days than science officers without their family.

Rationale: Since toddlers are often sick, we can expect that crew members with their family onboard need to take sick days more often.

VI. Variables (required for confirmatory study)

  • Independent Variable(s) and their operationalization
  • Dependent Variable(s) and their operationalization (e.g., time to solve a specified task)
  • Confounding Variable(s) and how their effect will be controlled (e.g., species type (Vulcan, Human, Tribble) might be a confounding factor; we control for it by separating our sample additionally into Human/Non-Human and using an ANOVA (normal distribution) or Friedman (non-normal distribution) to distill its effect).

For each variable, you should give:

  • name (e.g., presence of family)
  • abbreviation (if you intend to use one)
  • description (whether the family of the crew members travels on board)
  • scale type (nominal: either the family is present or not)
  • operationalization (crew members without family on board vs. crew members with family onboard)
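
One possible way to present these attributes in the report is a small table per variable. The sketch below fills in the "presence of family" example from the list above; the abbreviation PF and the table layout are assumptions made for illustration, not a format required by the track. It relies on the booktabs rules, which the acmart class is expected to load already.

```latex
% Hypothetical layout for documenting one variable; adapt as needed.
\begin{table}
  \caption{Independent variable: presence of family}
  \begin{tabular}{lp{5cm}}
    \toprule
    Attribute          & Value \\
    \midrule
    Name               & Presence of family \\
    Abbreviation       & PF (assumed for this example) \\
    Description        & Whether the family of the crew member travels on board \\
    Scale type         & Nominal (family present or not) \\
    Operationalization & Crew members without family on board vs.\ crew members with family on board \\
    \bottomrule
  \end{tabular}
\end{table}
```

Repeating the same five rows for every independent, dependent, and confounding variable keeps the descriptions easy to compare during review.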

VII. Participants/Subjects/Datasets (required)

Describe how and why you select the sample. If you conduct a meta-analysis, describe the primary studies or work on which you base it.

Example: We recruit crew members from the science department on a voluntary basis. They are our targeted population.

VIII. Execution Plan (required)

Describe the experimental setting and procedure. This includes the methods/tools that you plan to use (be specific about whether you developed them (and how) or whether they already exist), and the concrete steps that you plan to take to support/reject the hypotheses or answer the research questions.

Example: Each crew member needs to sign the informed consent and agreement to process their data according to GDPR. Then, we conduct the interviews. Afterwards, participants need to complete the simulated task …

Examples:

Confirmatory:

https://osf.io/5fptj/ - Do Explicit Review Strategies Improve Code Review Performance?

Exploratory:

https://osf.io/kfu9t - The Impact of Dynamics of Collaborative Software Engineering on Introverts: A Study Protocol

https://osf.io/acnwk - Large-Scale Manual Validation of Bugfixing Changes