Maat: Performance Metric Anomaly Anticipation for Cloud Services with Conditional Diffusion
Ensuring the reliability and user satisfaction of cloud services necessitates prompt anomaly detection followed by diagnosis. Existing techniques for anomaly detection focus solely on real-time detection, meaning that anomaly alerts are issued as soon as anomalies occur. However, anomalies can propagate and escalate into failures, making faster-than-real-time anomaly detection highly desirable for expediting downstream analysis and intervention. This paper proposes Maat, the first work to address anomaly anticipation of performance metrics in cloud services. Maat adopts a novel two-stage paradigm for anomaly anticipation, consisting of metric forecasting and anomaly detection on the forecasts. The metric forecasting stage employs a conditional denoising diffusion model to enable multi-step forecasting in an auto-regressive manner. The detection stage extracts anomaly-indicating features based on domain knowledge and applies an isolation forest with incremental learning to detect upcoming anomalies. As a result, the anomalies our method uncovers conform better to human expertise. Evaluation on three publicly available datasets demonstrates that Maat can anticipate anomalies faster than real time while being comparably or more effective than state-of-the-art real-time anomaly detectors. We also present cases highlighting Maat’s success in forecasting abnormal metrics and discovering anomalies.
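The two-stage idea in the abstract (forecast several steps ahead auto-regressively, then run detection on the forecasts) can be illustrated with a minimal sketch. This is not Maat's implementation: a linear-trend extrapolator stands in for the conditional denoising diffusion model, and a z-score check against a reference period stands in for the domain-knowledge features and isolation forest. All function names, parameters, and data values below are hypothetical.

```python
from statistics import mean, pstdev
from typing import Callable, List, Sequence


def forecast_autoregressive(
    history: Sequence[float],
    one_step: Callable[[List[float]], float],
    horizon: int,
    window: int,
) -> List[float]:
    """Roll a one-step forecaster forward, feeding each prediction back as input."""
    context = list(history[-window:])
    forecasts = []
    for _ in range(horizon):
        nxt = one_step(context)
        forecasts.append(nxt)
        # Slide the window: drop the oldest point, append the new forecast.
        context = context[1:] + [nxt]
    return forecasts


def linear_trend_step(context: List[float]) -> float:
    """Stand-in one-step model (assumption): extend the window's average slope."""
    slope = (context[-1] - context[0]) / (len(context) - 1)
    return context[-1] + slope


def anticipate(
    history: Sequence[float],
    baseline: Sequence[float],
    horizon: int = 3,
    window: int = 4,
    threshold: float = 3.0,
) -> List[int]:
    """Return the steps ahead (1-based) whose forecast looks anomalous.

    Stand-in detector (assumption): z-score of each forecast against a
    reference period of normal behaviour, not an isolation forest.
    """
    forecasts = forecast_autoregressive(history, linear_trend_step, horizon, window)
    mu = mean(baseline)
    sigma = pstdev(baseline) or 1e-9  # guard against a constant baseline
    return [
        step
        for step, value in enumerate(forecasts, start=1)
        if abs(value - mu) / sigma > threshold
    ]


# Hypothetical metric: a stable reference period followed by a ramp.
baseline = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.1, 9.9]
ramping = baseline + [10.5, 11.5, 13.0, 15.0]

print(anticipate(ramping, baseline))   # steps ahead flagged for the ramping metric
print(anticipate(baseline, baseline))  # steps ahead flagged for the stable metric
```

The point of the sketch is the control flow: detection runs on forecast values rather than observed ones, so an alert can be raised before the anomalous values are actually measured. Swapping in a learned forecaster or a different detector only changes `one_step` and the scoring inside `anticipate`.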
Tue 12 Sep. Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna
10:30 - 12:00
10:30 | 12m | Talk | Twin Graph-based Anomaly Detection via Attentive Multi-Modal Learning for Microservice System | Research Papers | Jun Huang (Anhui University of Technology), Yang Yang (Anhui University of Technology), Hang Yu (Ant Group), Jianguo Li (Ant Group), Xiao Zheng (Anhui University of Technology)
10:42 | 12m | Talk | Dynamic Graph Neural Networks-based Alert Link Prediction for Online Service Systems | Research Papers | Yiru Chen (Fudan University), Chenxi Zhang (Fudan University), Zhen Dong (Fudan University, China), Dingyu Yang (Alibaba Group), Xin Peng (Fudan University), Jiayu Ou (Alibaba Group), Hong Yang (Fudan University), Zheshun Wu (Alibaba Group), Xiaojun Qu (Alibaba Group), Wei Li (Alibaba Group)
10:54 | 12m | Talk | A Model-based Mode-Switching-Framework based on Security Vulnerability Scores | Journal-first Papers | Michael Riegler (Johannes Kepler University Linz), Johannes Sametinger (Johannes Kepler University Linz), Michael Vierhauser (University of Innsbruck), Manuel Wimmer (JKU Linz)
11:06 | 12m | Talk | Maat: Performance Metric Anomaly Anticipation for Cloud Services with Conditional Diffusion | Research Papers | Cheryl Lee (The Chinese University of Hong Kong), Tianyi Yang (The Chinese University of Hong Kong), Zhuangbin Chen (School of Software Engineering, Sun Yat-sen University), Yuxin Su (Sun Yat-sen University), Michael Lyu (The Chinese University of Hong Kong)
11:18 | 12m | Talk | Vicious Cycles in Distributed Software Systems (recorded talk) | Research Papers | Shangshu Qian (Purdue University), Wen Fan (Purdue University), Lin Tan (Purdue University), Yongle Zhang (Purdue University)
11:30 | 12m | Talk | Scene-Driven Exploration and GUI Modeling for Android Apps (recorded talk) | Research Papers | Xiangyu Zhang, Lingling Fan (Nankai University), Sen Chen (Tianjin University), Yucheng Su (Alibaba Group), Boyuan Li (Nankai University)