AgentFM: Role-Aware Failure Management for Distributed Databases with LLM-Driven Multi-Agents
Distributed databases are critical infrastructures for today’s large-scale software systems, making effective failure management essential to ensure software availability. However, existing approaches often overlook the role distinctions within distributed databases and rely on small-scale models with limited generalization capabilities. In this paper, we conduct a preliminary empirical study to emphasize the unique significance of different roles. Building on this insight, we propose AgentFM, a role-aware failure management framework for distributed databases powered by LLM-driven multi-agents. AgentFM addresses failure management by considering system roles, data roles, and task roles, with a meta-agent orchestrating these components. Preliminary evaluations using Apache IoTDB demonstrate the effectiveness of AgentFM and open new directions for further research.
Tue 24 Jun. Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna
16:00 - 17:40 | Failure and Fault | Demonstrations / Research Papers / Ideas, Visions and Reflections / Journal First | at Aurora B | Chair(s): Lars Grunske (Humboldt-Universität zu Berlin)
16:00 | 10m | Talk | AgentFM: Role-Aware Failure Management for Distributed Databases with LLM-Driven Multi-Agents | Ideas, Visions and Reflections | Lingzhe Zhang (Peking University, China), Yunpeng Zhai (Alibaba Group), Tong Jia (Institute for Artificial Intelligence, Peking University, Beijing, China), Xiaosong Huang (Peking University), Chiming Duan (Peking University), Ying Li (School of Software and Microelectronics, Peking University, Beijing, China)
16:10 | 20m | Talk | ReproCopilot: LLM-Driven Failure Reproduction with Dynamic Refinement | Research Papers | Tanakorn Leesatapornwongsa (Microsoft Research), Fazle Faisal (Microsoft Research), Suman Nath (Microsoft Research)
16:30 | 20m | Talk | Improving Graph Learning-Based Fault Localization with Tailored Semi-Supervised Learning | Research Papers | Chun Li (Nanjing University), Hui Li (Samsung Electronics (China) R&D Centre), Zhong Li, Minxue Pan (Nanjing University), Xuandong Li (Nanjing University)
16:50 | 20m | Talk | Towards Understanding Docker Build Faults in Practice: Symptoms, Root Causes, and Fix Patterns | Research Papers | Yiwen Wu (National University of Defense Technology), Yang Zhang (National University of Defense Technology, China), Tao Wang (National University of Defense Technology), Bo Ding (National University of Defense Technology), Huaimin Wang
17:10 | 20m | Talk | One Sentence Can Kill the Bug: Auto-replay Mobile App Crashes from One-sentence Overviews | Journal First | Yuchao Huang, Junjie Wang (Institute of Software at Chinese Academy of Sciences), Zhe Liu (Institute of Software, Chinese Academy of Sciences), Mingyang Li (Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences), Song Wang (York University), Chunyang Chen (TU Munich), Yuanzhe Hu (Institute of Software, Chinese Academy of Sciences), Qing Wang (Institute of Software at Chinese Academy of Sciences)
17:30 | 10m | Talk | Steering the Future: A Catalog of Failures in Deep Learning-Enabled Robotic Navigation Systems | Demonstrations | Meriel von Stein (University of Virginia), Yili Bai (University of Virginia), Trey Woodlief (University of Virginia, United States), Sebastian Elbaum (University of Virginia)
Aurora B is the second room in the Aurora wing.
When facing the main Cosmos Hall, access to the Aurora wing is on the right, close to the side entrance of the hotel.