FlakyCat: Predicting Flaky Tests Categories using Few-Shot Learning
Flaky tests are tests that yield different outcomes when run on the same version of a program. This non-deterministic behaviour plagues continuous integration with false signals, wasting developers’ time and reducing their trust in test suites. Studies have highlighted the importance of keeping test suites free of flakiness. Recently, the research community has been pushing towards the detection of flaky tests by proposing many static and dynamic approaches. While promising, those approaches mainly focus on classifying tests as flaky or not and, even when high performance is reported, it remains challenging to understand the cause of flakiness. Understanding that cause is crucial for researchers and developers who aim to fix flaky tests. To help with the comprehension of a given flaky test, we propose FlakyCat, the first approach to classify flaky tests based on their root cause category. FlakyCat relies on CodeBERT for code representation and leverages Siamese networks to train a multi-class classifier. We train and evaluate FlakyCat on a set of 451 flaky tests collected from open-source Java projects. Our evaluation shows that FlakyCat categorises flaky tests accurately, with an F1 score of 73%. Furthermore, we investigate the performance of our approach for each category, revealing that Async waits, Unordered collections and Time-related flaky tests are accurately classified, while Concurrency-related flaky tests are more challenging to predict. Finally, to facilitate the comprehension of FlakyCat’s predictions, we present a new technique for CodeBERT-based model interpretability that highlights the code statements influencing the categorisation.
Tue 16 May (displayed time zone: Hobart)
11:00 - 12:30
11:00 | 22m | Talk | On the Effect of Instrumentation on Test Flakiness | AST 2023 | Shawn Rasheed (Universal College of Learning), Jens Dietrich (Victoria University of Wellington), Amjed Tahir (Massey University) | Pre-print
11:22 | 22m | Talk | Debugging Flaky Tests using Spectrum-based Fault Localization | AST 2023 | Pre-print
11:45 | 22m | Talk | FlakyCat: Predicting Flaky Tests Categories using Few-Shot Learning | AST 2023 | Amal Akli (University of Luxembourg), Guillaume Haben (University of Luxembourg), Sarra Habchi (Ubisoft), Mike Papadakis (University of Luxembourg, Luxembourg), Yves Le Traon (University of Luxembourg, Luxembourg)
12:07 | 22m | Talk | Detecting Potential User-data Save & Export Losses due to Android App Termination | AST 2023 | Sydur Rahaman (New Jersey Institute of Technology), Umar Farooq (University of California at Riverside), Iulian Neamtiu (New Jersey Institute of Technology), Zhijia Zhao (University of California at Riverside)