ASE 2025
Sun 16 - Thu 20 November 2025 Seoul, South Korea

This program is tentative and subject to change.

Mon 17 Nov 2025 16:40 - 16:50 at Grand Hall 3 - Autonomous Systems

Time series anomaly detection (TSAD) is essential for ensuring the safety and reliability of aerospace software systems. Although large language models (LLMs) provide a promising training-free alternative to unsupervised approaches, their effectiveness in aerospace settings remains under-examined because of complex telemetry, misaligned evaluation metrics, and the absence of domain knowledge. To address this gap, we introduce ATSADBench, the first benchmark for aerospace TSAD. ATSADBench comprises nine tasks that combine three pattern-wise anomaly types, univariate and multivariate signals, and both in-loop and out-of-loop feedback scenarios, yielding 108,000 data points. Using this benchmark, we systematically evaluate state-of-the-art open-source LLMs under two paradigms: Direct, which labels anomalies within sliding windows, and Prediction-based, which detects anomalies from prediction errors. To reflect operational needs, we reformulate evaluation at the window level and propose three user-oriented metrics: Alarm Accuracy (AA), Alarm Latency (AL), and Alarm Contiguity (AC), which quantify alarm correctness, timeliness, and credibility. We further examine two enhancement strategies, few-shot learning and retrieval-augmented generation (RAG), to inject domain knowledge. The evaluation results show that (1) LLMs perform well on univariate tasks but struggle with multivariate telemetry, (2) their AA and AC on multivariate tasks approach random guessing, (3) few-shot learning provides modest gains whereas RAG offers no significant improvement, and (4) in practice LLMs can detect true anomaly onsets yet sometimes raise false alarms, which few-shot prompting mitigates but RAG exacerbates. These findings offer guidance for future LLM-based TSAD in aerospace software.
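The window-level evaluation described above can be illustrated with a small sketch. The exact definitions of Alarm Accuracy, Alarm Latency, and Alarm Contiguity used in ATSADBench are not reproduced on this page, so every formula, function name, and window parameter below is an assumption for illustration only, not the paper's method.

```python
# Illustrative sketch only: the metric formulas below are assumptions, not the
# definitions used in ATSADBench.
from typing import List, Optional

def to_windows(labels: List[int], size: int, stride: int) -> List[int]:
    """Collapse point labels to window labels: a window is anomalous (1)
    if it contains at least one anomalous point."""
    return [int(any(labels[i:i + size]))
            for i in range(0, len(labels) - size + 1, stride)]

def alarm_accuracy(gt, pred, size=10, stride=10):
    """Assumed AA: fraction of windows whose alarm label matches ground truth."""
    gw, pw = to_windows(gt, size, stride), to_windows(pred, size, stride)
    return sum(g == p for g, p in zip(gw, pw)) / len(gw)

def alarm_latency(gt, pred) -> Optional[int]:
    """Assumed AL: steps between the first true anomaly onset and the first
    alarm at or after it (None if no alarm is ever raised)."""
    onset = next((i for i, v in enumerate(gt) if v), None)
    if onset is None:
        return None
    first = next((i for i in range(onset, len(pred)) if pred[i]), None)
    return None if first is None else first - onset

def alarm_contiguity(gt, pred) -> Optional[float]:
    """Assumed AC: longest unbroken alarm run inside the true anomaly span,
    normalized by the span length."""
    span = [i for i, v in enumerate(gt) if v]
    if not span:
        return None
    best = run = 0
    for i in range(span[0], span[-1] + 1):
        run = run + 1 if pred[i] else 0
        best = max(best, run)
    return best / len(span)

gt   = [0] * 40 + [1] * 20 + [0] * 40   # one true anomaly at t = 40..59
pred = [0] * 43 + [1] * 15 + [0] * 42   # alarm fires 3 steps late, ends early

print(alarm_accuracy(gt, pred))    # 1.0 (all 10-step windows agree)
print(alarm_latency(gt, pred))     # 3
print(alarm_contiguity(gt, pred))  # 0.75
```

Under these assumed definitions, a detector can score perfect window-level accuracy while still alarming late and intermittently, which is why timeliness (AL) and credibility (AC) are measured separately from correctness (AA).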

Mon 17 Nov

Displayed time zone: Seoul

16:00 - 17:00
16:00
10m
Talk
Human-In-The-Loop Oracle Learning for Simulation-Based Testing
NIER Track
Ben-Hau Chia Carnegie Mellon University, Eunsuk Kang Carnegie Mellon University, Christopher Steven Timperley Carnegie Mellon University
16:10
10m
Talk
Taming Uncertainty via Automation: Observing, Analyzing, and Optimizing Agentic AI Systems
NIER Track
Dany Moshkovich IBM Research, Sergey Zeltyn IBM Research
16:20
10m
Talk
Out of Distribution Detection in Self-adaptive Robots with AI-powered Digital Twins
Industry Showcase
Erblin Isaku Simula Research Laboratory, and University of Oslo (UiO), Hassan Sartaj Simula Research Laboratory, Shaukat Ali Simula Research Laboratory and Oslo Metropolitan University, Beatriz Sanguino Norwegian University of Science and Technology, Tongtong Wang Norwegian University of Science and Technology, Guoyuan Li Norwegian University of Science and Technology, Houxiang Zhang Norwegian University of Science and Technology, Thomas Peyrucain PAL Robotics
16:30
10m
Talk
Unseen Data Detection using Routing Entropy in Mixture-of-Experts for Autonomous Vehicles
NIER Track
Sang In Lee Chungnam National University, Donghwan Shin University of Sheffield, Jihun Park Chungnam National University
Pre-print
16:40
10m
Talk
Evaluating Large Language Models for Time Series Anomaly Detection in Aerospace Software
Industry Showcase
Yang Liu Beijing Institute of Control Engineering, Yixing Luo Beijing Institute of Control Engineering, Xiaofeng Li Beijing Institute of Control Engineering, Xiaogang Dong Beijing Institute of Control Engineering, Bin Gu Beijing Institute of Control Engineering, Zhi Jin Peking University
16:50
10m
Talk
Bridging Research and Practice in Simulation-based Testing of Industrial Robot Navigation Systems
Industry Showcase
Sajad Khatiri Università della Svizzera italiana and University of Bern, Francisco Eli Viña Barrientos ANYbotics AG, Maximilian Wulf ANYbotics AG, Paolo Tonella USI Lugano, Sebastiano Panichella University of Bern