Fault Localization via Efficient Probabilistic Modeling of Program Semantics
Fri 13 May 2022 04:10 - 04:15 at ICSE room 2-even hours - Fault Localization Chair(s): Arosha K Bandara
Testing-based fault localization has been a significant topic in software engineering in the past decades. It localizes a faulty program element based on a set of passing and failing test executions. Since whether a fault could be triggered and detected by a test is related to program semantics, it is crucial to model program semantics in fault localization approaches. Existing approaches either consider the full semantics of the program (e.g., mutation-based fault localization and angelic debugging), leading to scalability issues, or ignore the semantics of the program (e.g., spectrum-based fault localization), leading to imprecise localization result.
Our key idea is: by modeling only the correctness of program values but not their full semantics, a balance could be reached between effectiveness and scalability. To realize this idea, we introduce a probabilistic approach to model program semantics and utilize information from static analysis and dynamic execution traces in our modeling. Our approach, SmartFL, is evaluated on a real-world dataset, Defects4J. The top-1 statement-level accuracy of our approach is 21%, which is the best among state-of-the-art methods. The average time cost is 210 seconds per fault, while existing methods that capture full semantics are often 10x or more slower.