SANER 2025
Tue 4 - Fri 7 March 2025 Montréal, Québec, Canada

Despite advances in high-performance Software Defect Prediction (SDP) models, practitioners remain hesitant to adopt them due to opaque decision-making and a lack of actionable insights. To address these limitations, recent research has applied various explainable AI (XAI) techniques to provide explainable and actionable guidance for SDP results, but the trustworthiness of this guidance for practitioners has not been sufficiently investigated. Practitioners may question the feasibility of implementing the proposed changes, and if those changes fail to resolve predicted defects or prove inaccurate, their trust in the guidance may diminish. In this study, we empirically evaluate the effectiveness of current XAI approaches for SDP across 32 releases of 9 large-scale projects, focusing on whether the guidance meets practitioners’ expectations. Our findings reveal that the actionable guidance they provide (i) does not guarantee that predicted defects are resolved; (ii) fails to pinpoint the modifications required to resolve predicted defects; and (iii) deviates from the typical code changes practitioners make in their projects. These limitations indicate that the guidance is not yet reliable enough to justify developers investing their limited debugging resources in it. We suggest that future XAI research for SDP incorporate feedback loops that offer clear rewards for practitioners’ efforts, and we propose a potential alternative approach utilizing counterfactual explanations.
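To give a rough sense of the counterfactual direction mentioned above, the sketch below perturbs the features of a toy defect prediction classifier until its prediction flips to "clean". The model, feature names, and greedy search are illustrative assumptions only, not the approach evaluated or proposed in the paper.

    # Hypothetical sketch (not the paper's method): a counterfactual-style
    # explanation for a toy file-level defect prediction model. Feature names
    # and the greedy search are assumptions for illustration.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    # Toy training data: columns stand in for [lines_of_code, code_churn, num_developers]
    rng = np.random.default_rng(0)
    X = rng.uniform(0, 1, size=(200, 3))
    y = (X[:, 0] + X[:, 1] > 1.0).astype(int)   # synthetic "defective" label
    model = RandomForestClassifier(random_state=0).fit(X, y)

    def counterfactual(x, model, step=0.05, max_iters=50):
        """Greedily lower one feature at a time until the model predicts 'clean',
        keeping the move that most reduces the predicted defect probability."""
        cf = x.copy()
        for _ in range(max_iters):
            if model.predict(cf.reshape(1, -1))[0] == 0:  # predicted clean
                return cf
            candidates = []
            for j in range(len(cf)):
                trial = cf.copy()
                trial[j] = max(0.0, trial[j] - step)
                prob = model.predict_proba(trial.reshape(1, -1))[0, 1]
                candidates.append((prob, trial))
            cf = min(candidates, key=lambda c: c[0])[1]
        return None  # no counterfactual found within the search budget

    x_defective = np.array([0.9, 0.8, 0.5])
    cf = counterfactual(x_defective, model)
    if cf is not None:
        # The per-feature deltas are the "suggested changes" a counterfactual
        # explanation would surface to a practitioner.
        print("suggested change (delta per feature):", cf - x_defective)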