Fri 2 May 2025 14:45 - 15:00 at 215 - SE for AI with Quality 2 Chair(s): Romina Spalazzese

Simulators are widely used to test Autonomous Driving Systems (ADS), but their potential flakiness can lead to inconsistent test results. We investigate test flakiness in simulation-based testing of ADS by addressing two key questions: (1) How do flaky ADS simulations impact automated testing that relies on randomized algorithms? and (2) Can machine learning (ML) effectively identify flaky ADS tests while decreasing the required number of test reruns? Our empirical results, obtained from two widely used open-source ADS simulators and five diverse ADS test setups, show that test flakiness in ADS is common and can significantly impact the test results obtained by randomized algorithms. Further, our ML classifiers effectively identify flaky ADS tests using only a single test run, achieving F1-scores of 85%, 82%, and 96% for three different ADS test setups. Our classifiers significantly outperform our non-ML baseline, which requires executing tests at least twice, by 31%, 21%, and 13% in F1-score, respectively. We conclude with a discussion of the scope, implications, and limitations of our study. We provide our complete replication package in a GitHub repository (Github paper 2023).
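To make the rerun-based idea concrete, here is a minimal Python sketch of one way a non-ML check could flag a flaky test: execute it at least twice and compare the verdicts. This is an illustration under stated assumptions, not the baseline from the paper. The function name `is_flaky_by_rerun`, the boolean pass/fail verdict, and the stub tests are all hypothetical.

```python
from itertools import cycle
from typing import Callable


def is_flaky_by_rerun(run_test: Callable[[], bool], reruns: int = 2) -> bool:
    """Flag a test as flaky if repeated executions disagree on pass/fail.

    Hypothetical helper: `run_test` stands in for one full simulation run
    that returns True (pass) or False (fail).
    """
    if reruns < 2:
        raise ValueError("need at least two runs to compare verdicts")
    verdicts = [run_test() for _ in range(reruns)]
    # More than one distinct verdict means the runs disagreed.
    return len(set(verdicts)) > 1


# Stub tests standing in for simulator executions (assumptions, not real ADS tests).
def stable_test() -> bool:
    return True  # always passes


_alternating = cycle([True, False])


def unstable_test() -> bool:
    return next(_alternating)  # passes, then fails, then passes, ...


stable_is_flaky = is_flaky_by_rerun(stable_test)
unstable_is_flaky = is_flaky_by_rerun(unstable_test)
```

A check like this costs at least two simulation runs per test, which is the expense the single-run classifiers described in the abstract aim to avoid.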

Fri 2 May

Displayed time zone: Eastern Time (US & Canada)

14:00 - 15:30
SE for AI with Quality 2, Journal-first Papers, at 215
Chair(s): Romina Spalazzese Malmö University
14:00
15m
Talk
Beyond Accuracy: An Empirical Study on Unit Testing in Open-source Deep Learning Projects (SE for AI)
Journal-first Papers
Han Wang Monash University, Sijia Yu Jilin University, Chunyang Chen TU Munich, Burak Turhan University of Oulu, Xiaodong Zhu Jilin University
14:15
15m
Talk
Boundary State Generation for Testing and Improvement of Autonomous Driving Systems (SE for AI)
Journal-first Papers
Matteo Biagiola Università della Svizzera italiana, Paolo Tonella USI Lugano
14:30
15m
Talk
D3: Differential Testing of Distributed Deep Learning with Model Generation (SE for AI)
Journal-first Papers
Jiannan Wang Purdue University, Hung Viet Pham York University, Qi Li, Lin Tan Purdue University, Yu Guo Meta Inc., Adnan Aziz Meta Inc., Erik Meijer
14:45
15m
Talk
Evaluating the Impact of Flaky Simulators on Testing Autonomous Driving Systems (SE for AI)
Journal-first Papers
Mohammad Hossein Amini University of Ottawa, Shervin Naseri University of Ottawa, Shiva Nejati University of Ottawa
15:00
15m
Talk
Reinforcement Learning for Online Testing of Autonomous Driving Systems: a Replication and Extension Study (SE for AI)
Journal-first Papers
Luca Giamattei Università di Napoli Federico II, Matteo Biagiola Università della Svizzera italiana, Roberto Pietrantuono Università di Napoli Federico II, Stefano Russo Università di Napoli Federico II, Paolo Tonella USI Lugano
15:15
15m
Talk
Two is Better Than One: Digital Siblings to Improve Autonomous Driving Testing (SE for AI)
Journal-first Papers
Matteo Biagiola Università della Svizzera italiana, Andrea Stocco Technical University of Munich, fortiss, Vincenzo Riccio University of Udine, Paolo Tonella USI Lugano