PyExplainer: Explaining the Predictions of Just-In-Time Defect Models
Just-In-Time (JIT) defect prediction (i.e., an AI/ML model that predicts defect-introducing commits) has been proposed to help developers prioritize their limited Software Quality Assurance (SQA) resources on the riskiest commits. However, the explainability of JIT defect models remains largely unexplored (i.e., practitioners still do not know why a commit is predicted as defect-introducing). Recently, LIME has been used to generate explanations for any AI/ML model. However, the random perturbation approach that LIME uses to generate synthetic neighbors is still suboptimal: it can generate synthetic neighbors that are not similar to the instance to be explained, which lowers the accuracy of the local models and leads to inaccurate explanations for JIT defect models.
In this paper, we propose PyExplainer, a local rule-based model-agnostic technique for generating explanations (i.e., why a commit is predicted as defective) of JIT defect models. Through a case study of two open-source software projects, we find that, compared with LIME (a state-of-the-art model-agnostic technique), PyExplainer produces (1) synthetic neighbors that are 41%-45% more similar to the instance to be explained; (2) 18%-38% more accurate local models; and (3) explanations that are 69%-98% more unique and 17%-54% more consistent with the actual characteristics of future defect-introducing commits. This could help practitioners focus on the most important aspects of a commit to mitigate the risk of it being defect-introducing. Thus, the contributions of this paper are an important step towards Explainable AI for Software Engineering, making software analytics more explainable and actionable. Finally, we publish PyExplainer as a Python package to support practitioners and researchers.
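To make the abstract's point about synthetic neighbors concrete, the sketch below generates neighbors of one commit by random perturbation and measures how far they fall from it. It is an illustration only, not the implementation used by LIME or PyExplainer: the feature names, the `black_box` rule, the noise scales, and every function name are hypothetical, and plain Gaussian noise stands in for a perturbation-based sampler.

```python
import math
import random

# Hypothetical commit features: [lines_added, files_changed, author_experience]
commit = [120.0, 4.0, 2.0]

def black_box(features):
    """Stand-in for a trained JIT defect model (hypothetical rule).
    Returns True when the commit is predicted as defect-introducing."""
    lines_added, files_changed, _experience = features
    return lines_added > 100 and files_changed > 3

def random_neighbors(instance, stds, n, rng):
    """Create n synthetic neighbors by adding independent Gaussian noise
    (one standard deviation per feature) to the instance."""
    return [
        [value + rng.gauss(0.0, std) for value, std in zip(instance, stds)]
        for _ in range(n)
    ]

def mean_distance(instance, neighbors):
    """Average Euclidean distance from the instance to its neighbors;
    smaller means the neighbors are more similar to the instance."""
    return sum(math.dist(instance, nb) for nb in neighbors) / len(neighbors)

def agreement(instance, neighbors):
    """Fraction of neighbors that receive the same black-box prediction
    as the instance being explained."""
    target = black_box(instance)
    return sum(black_box(nb) == target for nb in neighbors) / len(neighbors)

rng = random.Random(0)  # fixed seed so the run is repeatable
# Wide noise (assumed to mimic dataset-wide spread) vs. noise kept near the commit.
wide = random_neighbors(commit, [200.0, 10.0, 5.0], 500, rng)
narrow = random_neighbors(commit, [20.0, 1.0, 0.5], 500, rng)

for name, neighbors in (("wide", wide), ("narrow", narrow)):
    print(name, round(mean_distance(commit, neighbors), 1),
          round(agreement(commit, neighbors), 2))
```

A local model fitted on the wide sample has to describe regions of the feature space far from the commit, which is one intuition for why dissimilar neighbors can lower local-model accuracy. How PyExplainer actually generates its neighbors is not described in this abstract.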
Wed 17 Nov (displayed time zone: Hobart)
11:00 - 12:00 | Finding Defects (Research Papers / NIER track / Journal-first Papers) at Kangaroo | Chair(s): Xiao Liu (School of Information Technology, Deakin University)
11:00 | 20m | Talk | Graph-based Incident Aggregation for Large-Scale Online Service Systems | Research Papers | Zhuangbin Chen (Chinese University of Hong Kong, China), Yuxin Su (The Chinese University of Hong Kong), Jinyang Liu, Hongyu Zhang (University of Newcastle), Xuemin Wen (Huawei Technologies), Xiao Ling (Huawei Technologies), Yongqiang Yang (Huawei Technologies), Michael Lyu (The Chinese University of Hong Kong)
11:20 | 20m | Talk | PyExplainer: Explaining the Predictions of Just-In-Time Defect Models | Research Papers | Chanathip Pornprasit (Monash University), Kla Tantithamthavorn (Monash University), Jirayus Jiarpakdee (Monash University, Australia), Michael Fu (Monash University), Patanamon Thongtanunam (University of Melbourne)
11:40 | 10m | Talk | Towards Systematic and Dynamic Task Allocation for Collaborative Parallel Fuzzing | NIER track | Thuan Pham (The University of Melbourne), Manh-Dung Nguyen (Montimage R&D, France), Quang-Trung Ta (National University of Singapore), Toby Murray (University of Melbourne), Benjamin I.P. Rubinstein (University of Melbourne)
11:50 | 10m | Talk | An Extensive Study on Smell-Aware Bug Localization | Journal-first Papers | Aoi Takahashi (Tokyo Institute of Technology), Natthawute Sae-Lim (Tokyo Institute of Technology), Shinpei Hayashi (Tokyo Institute of Technology), Motoshi Saeki (Nanzan University)