ICSE 2021
Mon 17 May - Sat 5 June 2021

The SZZ algorithm for identifying bug-inducing changes has been widely used to evaluate defect prediction techniques and to empirically investigate when, how, and by whom bugs are introduced. Over the years, researchers have proposed several heuristics to improve the accuracy of SZZ, resulting in various implementations of the algorithm. However, fairly evaluating those implementations on a reliable oracle is an open problem: SZZ evaluations usually rely on (i) the manual analysis of the SZZ output to classify the identified bug-inducing commits as true or false positives; or (ii) a golden set linking bug-fixing and bug-inducing commits. In both cases, these manual evaluations are performed by researchers with limited knowledge of the studied subject systems. Ideally, there should be a golden set created by the original developers of the studied systems.

We propose a methodology to build a “developer-informed” oracle for the evaluation of SZZ variants. We use Natural Language Processing (NLP) to identify bug-fixing commits in which developers explicitly reference the commit(s) that introduced the fixed bug. A manual filtering step then ensures the quality and accuracy of the oracle. Once the oracle was built, we used it to evaluate several widely used variants of the SZZ algorithm in terms of their accuracy. Our evaluation helped us distill a set of important insights and lessons learned to further improve the SZZ algorithm.
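
To make the idea of an explicit reference concrete, the following is a minimal sketch of how a fix message that names its bug-inducing commit might be detected. It is a simplified regular-expression stand-in, not the NLP pipeline used in the paper; the keyword list, the pattern, and the function name referenced_inducing_commits are all assumptions made for illustration.

```python
import re

# Assumed heuristic (not the paper's NLP step): a causal keyword followed,
# within the same sentence, by something that looks like a commit hash.
# The lookahead requires at least one digit so that ordinary words made only
# of the letters a-f are not mistaken for hashes.
INDUCING_REF = re.compile(
    r"\b(?:introduced|caused|broken|regression)\b"   # causal keyword
    r"[^.\n]{0,40}?"                                  # a few words in between
    r"\b(?=[0-9a-f]*\d)([0-9a-f]{7,40})\b",           # 7-40 hex characters
    re.IGNORECASE,
)


def referenced_inducing_commits(message: str) -> list[str]:
    """Return the commit hashes that a fix message blames for the bug."""
    return [m.group(1).lower() for m in INDUCING_REF.finditer(message)]


if __name__ == "__main__":
    msg = "Fix NPE in parser, introduced by commit 1a2b3c4d5e"
    print(referenced_inducing_commits(msg))
```

Candidates found by a heuristic of this kind would still need the manual filtering step described above, since a keyword match alone cannot confirm that the referenced commit really introduced the bug.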

Conference Day
Wed 26 May

Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

14:30 - 15:30
2.3.1. Defect Prediction: Automation #1
Technical Track / SEIP - Software Engineering in Practice at Blended Sessions Room 1 +12h
Chair(s): Carolyn Seaman, University of Maryland Baltimore County
14:30
20m
Paper
Automatic Web Testing using Curiosity-Driven Reinforcement Learning
Technical Track
YAN ZHENG, Nanyang Technological University; Yi Liu, Southern University of Science and Technology; Xiaofei Xie, Nanyang Technological University; Yepang Liu, Southern University of Science and Technology, China; Lei Ma, University of Alberta; Jianye Hao, Tianjin University; Yang Liu, Nanyang Technological University
Pre-print Media Attached
14:50
20m
Paper
Evaluating SZZ Implementations Through a Developer-informed Oracle
Technical Track
Giovanni Rosa, University of Molise; Luca Pascarella, Università della Svizzera italiana (USI); Simone Scalabrino, University of Molise; Rosalia Tufano, Università della Svizzera Italiana; Gabriele Bavota, Software Institute, USI Università della Svizzera italiana; Michele Lanza, Software Institute, USI Università della Svizzera italiana; Rocco Oliveto, University of Molise
Pre-print Media Attached
15:10
20m
Paper
D2A: A Dataset Built for AI-Based Vulnerability Detection Methods Using Differential Analysis
SEIP - Software Engineering in Practice
Yunhui Zheng, IBM Research; Saurabh Pujar, IBM Research; Burn Lewis, IBM Research; Luca Buratti, IBM Research; Edward Epstein, IBM Research; Bo Yang, IBM Research; Jim A. Laredo, IBM Research, USA; Alessandro Morari, IBM Research; Zhong Su, IBM Research
Pre-print Media Attached

Conference Day
Thu 27 May

02:30 - 03:30
02:30
20m
Paper
Automatic Web Testing using Curiosity-Driven Reinforcement Learning
Technical Track
YAN ZHENG, Nanyang Technological University; Yi Liu, Southern University of Science and Technology; Xiaofei Xie, Nanyang Technological University; Yepang Liu, Southern University of Science and Technology, China; Lei Ma, University of Alberta; Jianye Hao, Tianjin University; Yang Liu, Nanyang Technological University
Pre-print Media Attached
02:50
20m
Paper
Evaluating SZZ Implementations Through a Developer-informed Oracle
Technical Track
Giovanni Rosa, University of Molise; Luca Pascarella, Università della Svizzera italiana (USI); Simone Scalabrino, University of Molise; Rosalia Tufano, Università della Svizzera Italiana; Gabriele Bavota, Software Institute, USI Università della Svizzera italiana; Michele Lanza, Software Institute, USI Università della Svizzera italiana; Rocco Oliveto, University of Molise
Pre-print Media Attached
03:10
20m
Paper
D2A: A Dataset Built for AI-Based Vulnerability Detection Methods Using Differential Analysis
SEIP - Software Engineering in Practice
Yunhui Zheng, IBM Research; Saurabh Pujar, IBM Research; Burn Lewis, IBM Research; Luca Buratti, IBM Research; Edward Epstein, IBM Research; Bo Yang, IBM Research; Jim A. Laredo, IBM Research, USA; Alessandro Morari, IBM Research; Zhong Su, IBM Research
Pre-print Media Attached