MSR 2022
Mon 23 - Tue 24 May 2022
co-located with ICSE 2022

Despite decades of research, SE lacks widely accepted models (that offer precise quantitative stable predictions) about what factors most influence software quality. This paper provides a promising result showing such stable models can be generated using a new transfer learning framework called “STABILIZER”. Given a tree of recursively clustered projects (using project meta-data), STABILIZER promotes a model upwards if it performs best in the lower clusters (stopping when the promoted model performs worse than the models seen at a lower level).
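The promotion rule described above can be illustrated with a minimal sketch. This is not the authors' implementation (their scripts are in the repository linked on this page); it only shows the general shape of "promote the best model upward until it does worse than what was seen below". Every name here (`Cluster`, `promote`, `cluster_score`, `toy_score`) is hypothetical. The sketch also makes assumptions the abstract does not state: a model is identified by the project it was trained on, a higher score is better, a cluster's score is the mean over its projects, and "worse than the lower level" means below the mean of the child clusters' winning scores.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Callable, List, Tuple

# Hypothetical signature: score(source_project, target_project), higher is better.
Score = Callable[[str, str], float]


@dataclass
class Cluster:
    projects: List[str]                       # every project under this node
    children: List["Cluster"] = field(default_factory=list)


def cluster_score(model: str, cluster: Cluster, score: Score) -> float:
    """Mean score of one model over all projects in a cluster (an assumption)."""
    return mean(score(model, p) for p in cluster.projects)


def promote(cluster: Cluster, score: Score) -> Tuple[List[str], float]:
    """Return the surviving models for this cluster and their reference score."""
    if not cluster.children:
        # Leaf cluster: every project's model is a candidate.
        candidates, lower = cluster.projects, float("-inf")
    else:
        results = [promote(child, score) for child in cluster.children]
        survivors = [m for models, _ in results for m in models]
        lower = mean(s for _, s in results)
        if any(len(models) > 1 for models, _ in results):
            # Promotion already stopped somewhere below; do not go further up.
            return survivors, lower
        candidates = survivors

    best = max(candidates, key=lambda m: cluster_score(m, cluster, score))
    best_score = cluster_score(best, cluster, score)
    if best_score < lower:
        # The promoted model does worse than the lower level: stop here.
        return candidates, lower
    return [best], best_score


def toy_score(source: str, target: str) -> float:
    """Made-up scores: model 'a' transfers well everywhere, 'c' only locally."""
    if source == "c":
        return 0.8 if target in ("c", "d") else 0.2
    return {"a": 0.9, "b": 0.5, "d": 0.6}[source]


left = Cluster(["a", "b"])
right = Cluster(["c", "d"])
root = Cluster(["a", "b", "c", "d"], [left, right])

models, best = promote(root, toy_score)
print(models, best)
```

In this toy tree a single model survives to the root because it scores well in both sub-clusters. If no candidate held up at the parent level, the sketch would return the child-level winners instead, which corresponds to the case where more than one model is needed.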

The number of models found by STABILIZER is minimal: one for defect prediction (756 projects) and fewer than a dozen for project health (1628 projects). Hence, via STABILIZER, it is possible to find a few projects which can be used for transfer learning and make conclusions that hold across hundreds of projects at a time. Further, the models produced in this manner offer predictions that perform as well as or better than the prior state-of-the-art.

To the best of our knowledge, STABILIZER is an order of magnitude faster than prior state-of-the-art transfer learners that seek conclusion stability, and these case studies are the largest demonstration of the generalizability of quantitative predictions of project quality yet reported in the SE literature.

In order to support open science, all our scripts and data are online at https://github.com/Anonymous633671/STABILIZER.

Thu 19 May

Displayed time zone: Eastern Time (US & Canada)

20:00 - 20:50
Session 12: Integration & Large-Scale Mining Technical Papers / Data and Tool Showcase Track at MSR Main room - even hours
Chair(s): Jin L.C. Guo McGill University, Amjed Tahir Massey University
20:00
4m
Short-paper
Is Open Source Eating the World’s Software? Measuring the Proportion of Open Source in proprietary software using Java Binaries
Technical Papers
Julius Musseau Mergebase, John Speed Meyers Chainguard, George P. Sieniawski IQT Labs, C. Albert Thompson Ford Motor Company, Daniel M. German University of Victoria
20:04
7m
Talk
Mining Code Review Data to Understand Waiting Times Between Acceptance and Merging: An Empirical Analysis
Technical Papers
Gunnar Kudrjavets University of Groningen, Aditya Kumar Snap, Inc., Nachiappan Nagappan Microsoft Research, Ayushi Rastogi University of Groningen, The Netherlands
20:11
7m
Talk
Methods for Stabilizing Models across Large Samples of Projects (with case studies on Predicting Defect and Project Health)
Technical Papers
Suvodeep Majumder North Carolina State University, Tianpei Xia North Carolina State University, Rahul Krishna North Carolina State University, Tim Menzies North Carolina State University
20:18
7m
Talk
Do Small Code Changes Merge Faster? A Multi-Language Empirical Investigation
Technical Papers
Gunnar Kudrjavets University of Groningen, Nachiappan Nagappan Microsoft Research, Ayushi Rastogi University of Groningen, The Netherlands
20:25
7m
Talk
FaST: A linear time stack trace alignment heuristic for crash report deduplication
Technical Papers
Irving Muller Rodrigues Polytechnique Montreal, Montreal, Canada, Daniel Aloise Polytechnique Montreal, Eraldo Rezende Fernandes Leuphana University of Lüneburg
20:32
4m
Talk
TwinDroid: A Dataset of Android app System call traces and Trace Generation Pipeline
Data and Tool Showcase Track
Asma Razgallah Université du Québec à Chicoutimi, Canada, Raphael Khoury Université du Québec à Chicoutimi, Canada, Jean-Baptiste Poulet Université du Québec à Chicoutimi, Canada
20:36
14m
Live Q&A
Discussions and Q&A
Technical Papers

Mon 23 May

Displayed time zone: Eastern Time (US & Canada)

13:30 - 15:00
Blended Technical Session 2 (Machine Learning and Information Retrieval) Technical Papers / Data and Tool Showcase Track at Room 315+316
Chair(s): Preetha Chatterjee Drexel University, USA
13:30
15m
Talk
Methods for Stabilizing Models across Large Samples of Projects (with case studies on Predicting Defect and Project Health)
Technical Papers
Suvodeep Majumder North Carolina State University, Tianpei Xia North Carolina State University, Rahul Krishna North Carolina State University, Tim Menzies North Carolina State University
13:45
15m
Talk
GraphCode2Vec: Generic Code Embedding via Lexical and Program Dependence Analyses
Technical Papers
Wei Ma SnT, University of Luxembourg, Mengjie Zhao LMU Munich, Ezekiel Soremekun SnT, University of Luxembourg, Qiang Hu University of Luxembourg, Jie M. Zhang King's College London, Mike Papadakis University of Luxembourg, Luxembourg, Maxime Cordy University of Luxembourg, Luxembourg, Xiaofei Xie Singapore Management University, Singapore, Yves Le Traon University of Luxembourg, Luxembourg
14:00
15m
Talk
Senatus: A Fast and Accurate Code-to-Code Recommendation Engine
Technical Papers
Fran Silavong JP Morgan Chase & Co., Sean Moran JP Morgan Chase & Co., Antonios Georgiadis JP Morgan Chase & Co., Rohan Saphal JP Morgan Chase & Co., Robert Otter JP Morgan Chase & Co.
14:15
8m
Short-paper
Comments on Comments: Where Code Review and Documentation Meet
Technical Papers
Nikitha Rao Carnegie Mellon University, Jason Tsay IBM Research, Martin Hirzel IBM Research, Vincent J. Hellendoorn Carnegie Mellon University
14:23
8m
Short-paper
On the Naturalness of Fuzzer Generated Code
Technical Papers
Rajeswari Hita Kambhamettu Carnegie Mellon University, John Billos Wake Forest University, Carolyn "Tomi" Oluwaseun-Apo Pennsylvania State University, Benjamin Gafford Carnegie Mellon University, Rohan Padhye Carnegie Mellon University, Vincent J. Hellendoorn Carnegie Mellon University
14:31
8m
Talk
SOSum: A Dataset of Stack Overflow Post Summaries
Data and Tool Showcase Track
Bonan Kou Purdue University, Yifeng Di Purdue University, Muhao Chen University of Southern California, Tianyi Zhang Purdue University
14:39
21m
Live Q&A
Discussions and Q&A
Technical Papers

