Deep Just-in-Time Defect Prediction: How Far Are We? (ISSTA 2021 - Technical Papers)

Who

Zhengran Zeng, Yuqun Zhang, Haotian Zhang, Lingming Zhang

Track

ISSTA 2021 Technical Papers

Time Zone

The program is currently displayed in (GMT+02:00) Brussels, Copenhagen, Madrid, Paris.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Thu 15 Jul 2021 20:00 - 20:20 at ISSTA 1 - Session 11 (time band 1) Machine Learning and Testing Chair(s): August Shi
Sat 17 Jul 2021 10:30 - 10:50 at ISSTA 1 - Session 27 (time band 3) Bugs and Analysis 2 Chair(s): Mike Papadakis

Abstract

Defect prediction aims to automatically identify potentially defective code with minimal human intervention and has been widely studied in the literature. Just-in-Time (JIT) defect prediction focuses on program changes rather than whole programs, and has been widely adopted in continuous testing. CC2Vec, a state-of-the-art JIT defect prediction tool, first constructs a hierarchical attention network (HAN) to learn distributed vector representations of both code additions and deletions. It then concatenates them with two other embedding vectors, representing commit messages and overall code changes, which are extracted by the existing DeepJIT approach, and trains a model to predict whether a given commit is defective. Although CC2Vec has been shown to be the state of the art for JIT defect prediction, it was only evaluated on a limited dataset and not compared with all representative baselines. Therefore, to further investigate the efficacy and limitations of CC2Vec, this paper performs an extensive study of CC2Vec on a large-scale dataset with over 310,370 changes (8.3× larger than the original CC2Vec dataset). We also empirically compare CC2Vec against DeepJIT and representative traditional JIT defect prediction techniques. The experimental results show that CC2Vec cannot consistently outperform DeepJIT, and neither of them can consistently outperform traditional JIT defect prediction. We also investigate the impact of individual traditional defect prediction features and find that the added-line-number feature outperforms other traditional features. Inspired by this finding, we construct a simplistic JIT defect prediction approach which simply adopts the added-line-number feature with the logistic regression classifier. Surprisingly, such a simplistic approach can outperform CC2Vec and DeepJIT in defect prediction, and can be 81k×/120k× faster in training/testing. Furthermore, the paper also provides various practical guidelines for advancing JIT defect prediction in the near future.
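
The simplistic approach described in the abstract (one feature, the number of added lines, fed to a logistic regression classifier) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy commits and labels are made up, the log transform of the feature is an assumption, and the helper names `added_lines_feature`, `fit_logistic_1d`, and `predict_proba` are hypothetical. The classifier is fitted with plain batch gradient descent so the example needs only the standard library.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def added_lines_feature(added_lines):
    # Assumption: log-transform the raw count to reduce skew.
    # The abstract only says the added-line-number feature is used.
    return math.log1p(added_lines)


def fit_logistic_1d(xs, ys, lr=0.1, epochs=5000):
    """Fit p(defective) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # gradient of the log loss
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Made-up toy data: added-line counts per commit and a defect label (1 = defective).
train_added = [1, 2, 3, 5, 8, 40, 60, 120, 300, 500]
train_label = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

xs = [added_lines_feature(a) for a in train_added]
w, b = fit_logistic_1d(xs, train_label)


def predict_proba(added_lines):
    # Estimated probability that a commit with this many added lines is defective.
    return sigmoid(w * added_lines_feature(added_lines) + b)
```

On this toy data the fitted weight is positive, so `predict_proba` increases with the number of added lines; a real evaluation would use labeled commits from actual projects rather than hand-picked values.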

DOI

https://doi.org/10.1145/3460319.3464819

Zhengran Zeng

Southern University of Science and Technology

China

Yuqun Zhang

Southern University of Science and Technology

China

Haotian Zhang

Kwai

China

Lingming Zhang

University of Illinois at Urbana-Champaign

United States

Session Program

Thu 15 Jul
Displayed time zone: Brussels, Copenhagen, Madrid, Paris

19:00 - 20:20	Session 11 (time band 1) Machine Learning and Testing, Technical Papers at ISSTA 1 Chair(s): August Shi University of Texas at Austin

19:00 20m Talk		Interval Constraint-Based Mutation Testing of Numerical Specifications Technical Papers Clothilde Jeangoudoux MPI-SWS, Eva Darulova MPI-SWS, Christoph Lauter University of Alaska at Anchorage
19:20 20m Talk		Predoo: Precision Testing of Deep Learning Operators Technical Papers Xufan Zhang Nanjing University, Ning Sun Nanjing University, Chunrong Fang Nanjing University, Jiawei Liu Nanjing University, Jia Liu Nanjing University, Dong Chai Huawei, Jiang Wang Huawei, Zhenyu Chen Nanjing University
19:40 20m Talk		TERA: Optimizing Stochastic Regression Tests in Machine Learning Projects Technical Papers Saikat Dutta University of Illinois at Urbana-Champaign, Jeeva Selvam University of Illinois at Urbana-Champaign, Aryaman Jain University of Illinois at Urbana-Champaign, Sasa Misailovic University of Illinois at Urbana-Champaign
20:00 20m Talk		Deep Just-in-Time Defect Prediction: How Far Are We? Technical Papers Zhengran Zeng Southern University of Science and Technology, Yuqun Zhang Southern University of Science and Technology, Haotian Zhang Kwai, Lingming Zhang University of Illinois at Urbana-Champaign

Sat 17 Jul
Displayed time zone: Brussels, Copenhagen, Madrid, Paris

09:30 - 11:10	Session 27 (time band 3) Bugs and Analysis 2, Technical Papers at ISSTA 1 Chair(s): Mike Papadakis University of Luxembourg, Luxembourg

09:30 20m Talk		Faster, Deeper, Easier: Crowdsourcing Diagnosis of Microservice Kernel Failure from User Space Technical Papers Yicheng Pan Peking University, Meng Ma Peking University, Xinrui Jiang Peking University, Ping Wang Peking University
09:50 20m Talk		Finding Data Compatibility Bugs with JSON Subschema Checking (Distinguished Artifact) Technical Papers Andrew Habib SnT, University of Luxembourg, Avraham Shinnar IBM Research, Martin Hirzel IBM Research, Michael Pradel University of Stuttgart
10:10 20m Talk		Semantic Table Structure Identification in Spreadsheets Technical Papers Yakun Zhang Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Xiao Lv Microsoft Research, Haoyu Dong Microsoft Research, Wensheng Dou Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Shi Han Microsoft Research, Dongmei Zhang Microsoft Research, Jun Wei Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Dan Ye Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences
10:30 20m Talk		Deep Just-in-Time Defect Prediction: How Far Are We? Technical Papers Zhengran Zeng Southern University of Science and Technology, Yuqun Zhang Southern University of Science and Technology, Haotian Zhang Kwai, Lingming Zhang University of Illinois at Urbana-Champaign
10:50 20m Talk		Continuous Test Suite Failure Prediction Technical Papers Cong Pan Beihang University, Michael Pradel University of Stuttgart