Taming Uncertainty via Automation: Observing, Analyzing, and Optimizing Agentic AI Systems
This program is tentative and subject to change.
Large Language Models (LLMs) are increasingly deployed within agentic systems—collections of interacting, LLM-powered agents that execute complex, adaptive workflows using memory, tools, and dynamic planning. While enabling powerful new capabilities, these systems also introduce unique forms of uncertainty stemming from probabilistic reasoning, evolving memory states, and fluid execution paths. Traditional software observability and operations practices fall short in addressing these challenges.
This paper introduces AgentOps: a comprehensive framework for observing, analyzing, optimizing, and automating operation of agentic AI systems. We identify distinct needs across four key roles—developers, testers, site reliability engineers (SREs), and business users—each of whom engages with the system at different points in its lifecycle. We present the AgentOps Automation Pipeline, a six-stage process encompassing behavior observation, metric collection, issue detection, root cause analysis, optimized recommendations, and runtime automation. Throughout, we emphasize the critical role of automation in managing uncertainty and enabling self-improving AI systems—not by eliminating uncertainty, but by taming it to ensure safe, adaptive, and effective operation.
Mon 17 Nov (displayed time zone: Seoul)
16:00 - 17:00
16:00 10m Talk | Human-In-The-Loop Oracle Learning for Simulation-Based Testing | NIER Track | Ben-Hau Chia (Carnegie Mellon University), Eunsuk Kang (Carnegie Mellon University), Christopher Steven Timperley (Carnegie Mellon University)
16:10 10m Talk | Taming Uncertainty via Automation: Observing, Analyzing, and Optimizing Agentic AI Systems | NIER Track
16:20 10m Talk | Out of Distribution Detection in Self-adaptive Robots with AI-powered Digital Twins | Industry Showcase | Erblin Isaku (Simula Research Laboratory and University of Oslo (UiO)), Hassan Sartaj (Simula Research Laboratory), Shaukat Ali (Simula Research Laboratory and Oslo Metropolitan University), Beatriz Sanguino (Norwegian University of Science and Technology), Tongtong Wang (Norwegian University of Science and Technology), Guoyuan Li (Norwegian University of Science and Technology), Houxiang Zhang (Norwegian University of Science and Technology), Thomas Peyrucain (PAL Robotics)
16:30 10m Talk | Unseen Data Detection using Routing Entropy in Mixture-of-Experts for Autonomous Vehicles | NIER Track | Sang In Lee (Chungnam National University), Donghwan Shin (University of Sheffield), Jihun Park (Chungnam National University)
16:40 10m Talk | Evaluating Large Language Models for Time Series Anomaly Detection in Aerospace Software | Industry Showcase | Yang Liu (Beijing Institute of Control Engineering), Yixing Luo (Beijing Institute of Control Engineering), Xiaofeng Li (Beijing Institute of Control Engineering), Xiaogang Dong (Beijing Institute of Control Engineering), Bin Gu (Beijing Institute of Control Engineering), Zhi Jin (Peking University)
16:50 10m Talk | Bridging Research and Practice in Simulation-based Testing of Industrial Robot Navigation Systems | Industry Showcase | Sajad Khatiri (Università della Svizzera italiana and University of Bern), Francisco Eli Viña Barrientos (ANYbotics AG), Maximilian Wulf (ANYbotics AG), Paolo Tonella (USI Lugano), Sebastiano Panichella (University of Bern)