Learning to Recognize Actionable Static Code Warnings (is Intrinsically Easy)
Thu 12 May 2022 13:10 - 13:15 at ICSE room 4-odd hours - Machine Learning with and for SE 12 Chair(s): Wei Yang
Fri 27 May 2022 11:20 - 11:25 at Room 301+302 - Papers 19: Machine Learning with and for SE 2 Chair(s): Dalal Alrajeh
Static code warning tools often generate warnings that programmers ignore. Such tools can be made more useful via data mining algorithms that select the “actionable” warnings, i.e., the warnings that programmers usually do not ignore.
In this paper, we study a sample of 31,058 static code warnings from FindBugs, of which 5,675 were actionable. We find that data mining algorithms can identify the actionable warnings with remarkable ease. Specifically, a range of data mining methods (deep learners, random forests, decision tree learners, and support vector machines) all achieved very good results: recall and AUC(TNR, TPR) were usually over 95%, and false alarm rates were usually under 5%.
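To make that comparison concrete, the sketch below shows one way such an evaluation could be run with scikit-learn. It is an illustration under assumptions, not the paper's actual pipeline: the feature matrix X and the actionable/ignored labels y are random placeholders standing in for the FindBugs warning data, and ROC AUC is used as the AUC measure (equivalent to the area under the TPR-versus-TNR curve).

    # Sketch: compare several off-the-shelf learners on static-warning features.
    # X (rows = warnings, columns = features) and y (1 = actionable, 0 = ignored)
    # are random placeholders; in practice they come from the FindBugs data.
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.svm import SVC
    from sklearn.neural_network import MLPClassifier
    from sklearn.metrics import recall_score, roc_auc_score, confusion_matrix

    rng = np.random.default_rng(0)
    X, y = rng.random((1000, 58)), rng.integers(0, 2, 1000)   # placeholder data

    learners = {
        "random forest": RandomForestClassifier(random_state=0),
        "decision tree": DecisionTreeClassifier(random_state=0),
        "linear SVM":    SVC(kernel="linear", probability=True, random_state=0),
        "neural net":    MLPClassifier(max_iter=2000, random_state=0),
    }

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
    for name, clf in learners.items():
        clf.fit(X_tr, y_tr)
        pred = clf.predict(X_te)
        tn, fp, fn, tp = confusion_matrix(y_te, pred).ravel()
        print(f"{name:15s} recall={recall_score(y_te, pred):.2f} "
              f"false_alarm={fp / (fp + tn):.2f} "
              f"auc={roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]):.2f}")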
Given that all these learners succeeded so easily, it is appropriate to ask whether there is something about this task that is inherently easy. We report that while our data sets have up to 58 raw features, those features can be approximated by less than two underlying dimensions; that is, the data's intrinsic dimensionality is very low. For such intrinsically simple data, many different kinds of learners can generate useful models with similar performance.
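One common way to check a claim of this kind (a sketch only; not necessarily the estimator used in the paper) is the correlation dimension: count the fraction of point pairs that lie within a radius r, then read the intrinsic dimension off the slope of log(count) against log(r).

    # Sketch: estimate intrinsic dimensionality with the correlation dimension.
    # The slope of log(fraction of pairs within r) versus log(r) approximates
    # the number of underlying dimensions; this is one standard estimator,
    # not necessarily the exact measure used in the paper.
    import numpy as np
    from scipy.spatial.distance import pdist

    def correlation_dimension(X, radii):
        d = pdist(X)                                   # all pairwise distances
        counts = np.array([np.mean(d < r) for r in radii])
        slope, _ = np.polyfit(np.log(radii), np.log(counts), 1)
        return slope

    X = np.random.default_rng(0).random((500, 58))     # placeholder for the raw features
    radii = np.percentile(pdist(X), [5, 10, 20, 40])   # radii where counts stay non-zero
    print("estimated intrinsic dimension:", round(correlation_dimension(X, radii), 2))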
Based on the above, we conclude that learning to recognize actionable static code warnings is easy for a wide range of learning algorithms, since the underlying data is intrinsically simple. If we had to pick one particular learner for this task, we would suggest linear SVMs (since, at least in our sample, that learner ran relatively quickly and achieved the best median performance), and we would not recommend deep learning (since this data is intrinsically very simple).
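For reference, a minimal sketch of that recommended choice, assuming scikit-learn's LinearSVC with standard feature scaling and placeholder data (again, an illustration rather than the paper's exact configuration):

    # Sketch: the suggested learner, a linear SVM with feature scaling,
    # timed over a simple cross-validation. Data is a random placeholder.
    import time
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import LinearSVC
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    X, y = rng.random((1000, 58)), rng.integers(0, 2, 1000)

    model = make_pipeline(StandardScaler(), LinearSVC(max_iter=5000))
    start = time.time()
    scores = cross_val_score(model, X, y, cv=5, scoring="recall")
    print(f"median recall={np.median(scores):.2f}  runtime={time.time() - start:.1f}s")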