Revisiting Process versus Product Metrics: a Large Scale Analysis (ICSE 2022 - Journal-First Papers)

Sun 8 - Fri 27 May 2022

Who

Suvodeep Majumder, Pranav Mody, Tim Menzies

Track

ICSE 2022 Journal-First Papers

Time Zone

The program is currently displayed in (GMT-04:00) Eastern Time (US & Canada).

The GMT offsets shown reflect the offsets at the moment of the conference.


When

Mon 9 May 2022 20:00 - 20:05 at ICSE room 1 - Machine Learning with and for SE 4 Chair(s): Gias Uddin
Thu 12 May 2022 13:05 - 13:10 at ICSE room 4 - Machine Learning with and for SE 12 Chair(s): Wei Yang

Abstract

Numerous methods can build predictive models from software data. However, what methods and conclusions should we endorse as we move from analytics in-the-small (dealing with a handful of projects) to analytics in-the-large (dealing with hundreds of projects)?

To answer this question, we recheck prior small-scale results (about process versus product metrics for defect prediction and the granularity of metrics) using 722,471 commits from 700 GitHub projects. We find that some analytics in-the-small conclusions still hold when scaling up to analytics in-the-large. For example, like prior work, we see that process metrics are better predictors for defects than product metrics (best process/product-based learners respectively achieve recalls of 98%/44% and AUCs of 95%/54%, median values).

That said, we warn that it is unwise to trust metric importance results from analytics in-the-small studies since those change dramatically when moving to analytics in-the-large. Also, when reasoning in-the-large about hundreds of projects, it is better to use predictions from multiple models (since single model predictions can become confused and exhibit a high variance).
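As a rough illustration of the two measures quoted above (recall and AUC) and of the advice to combine predictions from multiple models, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the toy labels and scores are invented, the function names are hypothetical, and simple score averaging stands in for whatever combination scheme the paper actually uses.

```python
from statistics import mean


def recall(y_true, y_pred):
    """Fraction of truly defective items (label 1) that were predicted defective."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def auc(y_true, scores):
    """AUC as the chance that a random defective item outscores a random
    clean one; ties count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one item of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_scores(per_model_scores):
    """Combine several models by averaging their scores item by item
    (an illustrative choice, not necessarily the paper's method)."""
    return [mean(column) for column in zip(*per_model_scores)]


def threshold(scores, cutoff=0.5):
    """Turn scores into 0/1 predictions; the 0.5 cutoff is an assumption."""
    return [1 if s >= cutoff else 0 for s in scores]


# Invented toy data: 1 = defective, 0 = clean.
y_true = [1, 1, 1, 0, 0, 0]

# Hypothetical scores from three models; each one misranks a different item.
model_scores = [
    [0.9, 0.4, 0.7, 0.6, 0.2, 0.1],
    [0.8, 0.7, 0.3, 0.2, 0.5, 0.1],
    [0.6, 0.8, 0.9, 0.3, 0.1, 0.7],
]

combined = average_scores(model_scores)

for i, scores in enumerate(model_scores, start=1):
    print(f"model {i}: recall={recall(y_true, threshold(scores)):.2f} "
          f"auc={auc(y_true, scores):.2f}")
print(f"combined: recall={recall(y_true, threshold(combined)):.2f} "
      f"auc={auc(y_true, combined):.2f}")
```

On this toy data each single model misranks one defective/clean pair, while the averaged scores rank every defective item above every clean one. That is a property of the invented numbers, not evidence for the paper's claim; it only shows the mechanism by which individual models' errors can cancel out.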

Link to Publication

https://link.springer.com/article/10.1007/s10664-021-10068-4

Link to Preprint

https://arxiv.org/pdf/2008.09569.pdf

DOI

https://doi.org/10.1007/s10664-021-10068-4

File attachments

Revisiting Process versus Product Metrics: a Large Scale Analysis (Revisiting Process versus Product Metrics- a Large Scale Analysis.pdf)	2.3MiB

Suvodeep Majumder

North Carolina State University

Pranav Mody

North Carolina State University

Tim Menzies

North Carolina State University

United States

Revisiting Process versus Product Metrics: a Large Scale Analysis

Session Program

Mon 9 May
Displayed time zone: Eastern Time (US & Canada)

20:00 - 21:00	Machine Learning with and for SE 4 (Journal-First Papers / Technical Track / SEIP - Software Engineering in Practice) at ICSE room 1. Chair(s): Gias Uddin (University of Calgary, Canada)

5m Talk		Revisiting Process versus Product Metrics: a Large Scale Analysis. Journal-First Papers. Suvodeep Majumder (North Carolina State University), Pranav Mody (North Carolina State University), Tim Menzies (North Carolina State University)
5m Talk		Learning to Recognize Actionable Static Code Warnings (is Intrinsically Easy). Journal-First Papers. Xueqi Yang (NCSU), Jianfeng Chen (North Carolina State University), Rahul Yedida (North Carolina State University), Zhe Yu, Tim Menzies (North Carolina State University)
5m Talk		Mining Root Cause Knowledge from Cloud Service Incident Investigations for AIOps. SEIP - Software Engineering in Practice. Amrita Saha (Salesforce Research Asia), Steven C.H. Hoi (Salesforce Research Asia)
5m Talk		FairNeuron: Improving Deep Neural Network Fairness with Adversary Games on Selective Neurons. Technical Track. Xuanqi Gao (Xi'an Jiaotong University), Juan Zhai (Rutgers University), Shiqing Ma (Rutgers University), Chao Shen (Xi'an Jiaotong University), Yufei Chen (Xi'an Jiaotong University), Qian Wang (Wuhan University)
5m Talk		EREBA: Black-box Energy Testing of Adaptive Neural Networks. Technical Track. Mirazul Haque (UT Dallas), Yaswanth Yadlapalli (University of Texas at Dallas), Wei Yang (University of Texas at Dallas), Cong Liu (University of Texas at Dallas, USA)
5m Talk		Training Data Debugging for the Fairness of Machine Learning Software. Technical Track. Yanhui Li (Department of Computer Science and Technology, Nanjing University), Linghan Meng (Nanjing University), Lin Chen (Department of Computer Science and Technology, Nanjing University), Li Yu (Nanjing University), Di Wu (Momenta), Yuming Zhou (Nanjing University), Baowen Xu (Nanjing University)

Thu 12 May
Displayed time zone: Eastern Time (US & Canada)

13:00 - 14:00	Machine Learning with and for SE 12 (Journal-First Papers / Technical Track / NIER - New Ideas and Emerging Results) at ICSE room 4. Chair(s): Wei Yang (University of Texas at Dallas)

5m Talk		Modeling Functional Similarity in Source Code with Graph-Based Siamese Networks. Journal-First Papers. NIKITA MEHROTRA (Indraprastha Institute of Information Technology), NAVDHA AGARWAL (Indraprastha Institute of Information Technology, Delhi), PIYUSH GUPTA (Indraprastha Institute of Information Technology, Delhi), SAKET ANAND (Indraprastha Institute of Information Technology, Delhi), David Lo (Singapore Management University), Rahul Purandare (IIIT-Delhi)
5m Talk		Revisiting Process versus Product Metrics: a Large Scale Analysis. Journal-First Papers. Suvodeep Majumder (North Carolina State University), Pranav Mody (North Carolina State University), Tim Menzies (North Carolina State University)
5m Talk		Learning to Recognize Actionable Static Code Warnings (is Intrinsically Easy). Journal-First Papers. Xueqi Yang (NCSU), Jianfeng Chen (North Carolina State University), Rahul Yedida (North Carolina State University), Zhe Yu, Tim Menzies (North Carolina State University)
5m Talk		Improving the Learnability of Machine Learning APIs by Semi-Automated API Wrapping. NIER - New Ideas and Emerging Results. Lars Reimann (University of Bonn), Günter Kniesel-Wünsche (University of Bonn)
5m Talk		Improving Machine Translation Systems via Isotopic Replacement. Technical Track. Zeyu Sun (Peking University), Jie M. Zhang (King's College London), Yingfei Xiong (Peking University), Mark Harman (University College London), Mike Papadakis (University of Luxembourg, Luxembourg), Lu Zhang (Peking University)
5m Talk		Collaboration Challenges in Building ML-Enabled Systems: Communication, Documentation, Engineering, and Process (Distinguished Paper Award). Technical Track. Nadia Nahar (Carnegie Mellon University), Shurui Zhou (University of Toronto), Grace Lewis (Carnegie Mellon Software Engineering Institute), Christian Kästner (Carnegie Mellon University)

Information for Participants

Mon 9 May 2022 20:00 - 21:00 at ICSE room 1 - Machine Learning with and for SE 4 Chair(s): Gias Uddin

Thu 12 May 2022 13:00 - 14:00 at ICSE room 4 - Machine Learning with and for SE 12 Chair(s): Wei Yang
