Finding Safety Violations of AI-Enabled Control Systems through the Lens of Synthesized Proxy Programs
This program is tentative and subject to change.
Context and Problem Statement. Given the increasing adoption of modern AI-enabled control systems (e.g., autonomous vehicles, drones), ensuring their safety and reliability has become a critical task in software testing. One prevalent approach to testing control systems is falsification, which aims to find an input signal that causes the control system to violate a formal safety specification using optimization algorithms. However, applying falsification to AI-enabled control systems poses two significant challenges: (1) it requires the system to execute numerous candidate test inputs, which can be time-consuming, particularly for systems with AI models that have many parameters, and (2) multiple safety requirements are typically defined as a conjunctive specification, which existing falsification methods struggle to cover comprehensively. In such contexts, developing a falsification framework tailored for AI-enabled control systems is of paramount importance.
Methodology. This paper introduces Synthify, a falsification framework tailored for AI-enabled control systems, i.e., control systems equipped with AI controllers. Our approach performs falsification in a two-phase process. At the start, Synthify synthesizes a program that implements one or a few linear controllers to serve as a proxy for the AI controller. This proxy program mimics the AI controller’s functionality but is computationally more efficient. Then, Synthify employs the 𝜖-greedy strategy to sample a promising sub-specification from the conjunctive safety specification. It then uses a Simulated Annealing-based falsification algorithm to find violations of the sampled sub-specification for the control system. If a violation is spurious, Synthify refines the proxy program to perform more similarly to the original AI controller. Otherwise, it terminates the falsification process and report the found violations.
Evaluation. We compare Synthify to PSY-TaLiRo, a state-of-the-art and industrial-strength falsification tool, on 8 publicly available control systems. On average, Synthify achieves a 83.5% higher success rate in falsification compared to PSY-TaLiRo with the same budget of falsification trial, and reveals 7.8Ă— more safety violations than PSY-TaLiRo within the same time budget. Additionally, our method is 12.8Ă— faster in finding a single safety violation than the baseline. The safety violations found by Synthify are also more diverse than those found by PSY-TaLiRo, covering 137.7% more sub-specifications. The implementation of Synthify, along with datasets and raw results, has been made available at https://github.com/soarsmu/Synthify.
Future Work. Our work highlights several promising directions for future research, such as extending Synthify to AI-enabled control systems that handle more complex inputs like images. Another promising avenue is to integrate Synthify with other safety analysis techniques, such as runtime shields, to further enhance the reliability of AI-enabled control systems.
This paper has been accepted for publication in ACM Transactions on Software Engineering and Methodology (TOSEM) on January 14, 2025, and is available at https://doi.org/10.1145/3715105. Jieke Shi will be the presenting author.
This program is tentative and subject to change.
Tue 18 NovDisplayed time zone: Seoul change
14:00 - 15:30 | |||
14:00 10mTalk | Quantum Circuit Mutants: Empirical Analysis and Recommendations Journal-First Track Eñaut Mendiluze Usandizaga Simula Research Laboratory, Norway, Shaukat Ali Simula Research Laboratory and Oslo Metropolitan University, Tao Yue Beihang University, Paolo Arcaini National Institute of Informatics
| ||
14:10 10mTalk | MET-MAPF: A Metamorphic Testing Approach for Multi-Agent Path Finding Algorithms Journal-First Track Xiao-Yi Zhang University of Science and Technology Beijing, Yang Liu Nanyang Technological University, Paolo Arcaini National Institute of Informatics
, Mingyue Jiang Zhejiang Sci-Tech University, Zheng Zheng Beihang University | ||
14:20 10mTalk | State Field Coverage: A Metric for Oracle Quality Research Papers Facundo Molina IMDEA Software Institute, Nazareno Aguirre University of Rio Cuarto and CONICET, Alessandra Gorla IMDEA Software Institute | ||
14:30 10mTalk | Do LLMs Generate Useful Test Oracles? An Empirical Study with an Unbiased Dataset Research Papers Davide Molinelli USI Lugano; Schaffhausen Institute of Technology, Luca Di Grazia University of St. Gallen, Alberto Martin-Lopez Software Institute - USI, Lugano, Michael D. Ernst University of Washington, Mauro Pezze UniversitĂ della Svizzera italiana (USI) and UniversitĂ degli Studi di Milano Bicocca | ||
14:40 10mTalk | Finding Safety Violations of AI-Enabled Control Systems through the Lens of Synthesized Proxy Programs Journal-First Track Jieke Shi Singapore Management University, Zhou Yang University of Alberta, Alberta Machine Intelligence Institute , Junda He Singapore Management University, Bowen Xu North Carolina State University, Dongsun Kim Korea University, DongGyun Han Royal Holloway, University of London, David Lo Singapore Management University Link to publication DOI Pre-print | ||
14:50 10mTalk | ZendDiff: Differential Testing of PHP Interpreter Research Papers Yuancheng Jiang National University of Singapore, Jianing Wang National University of Singapore, Qiange Liu Beihang University, Yeqi Fu National University of Singapore, Jian Mao Beihang University, Roland H. C. Yap National University of Singapore, Zhenkai Liang National University of Singapore | ||
15:00 10mTalk | SATORI: Static Test Oracle Generation for REST APIs Research Papers Juan C. Alonso Universidad de Sevilla, Alberto Martin-Lopez Software Institute - USI, Lugano, Sergio Segura SCORE Lab, I3US Institute, Universidad de Sevilla, Seville, Spain, Gabriele Bavota Software Institute @ Università della Svizzera Italiana, Antonio Ruiz-Cortés University of Seville | ||
15:10 10mTalk | Exact Inference for Quantum Circuits: A Testing Oracle for Quantum Software Stacks Research Papers | ||
15:20 10mTalk | Identifying inconsistent software defect predictions with symmetry metamorphic relation pattern Journal-First Track Chan Pak Yuen Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong, China, Jacky Keung City University of Hong Kong, Zhen Yang Shandong University | ||