CloudIntelligence 2021
Sat 29 May 2021
co-located with ICSE 2021
Sat 29 May 2021 16:55 - 17:07 at CloudIntelligence Room - Project Showcase Session. Chair(s): Yingnong Dang

In recent years, cloud systems (e.g., Microsoft Azure) have grown explosively, and a wide variety of software services have been deployed on them. Because cloud systems must serve customers on a 24/7 basis, high service reliability is essential. To reduce the number of faults in cloud systems, many machine-learning-based fault forecasting methods have been proposed. These methods aim to predict faults in advance so that proactive actions can be taken to avoid negative impact, and they mainly focus on a specific hardware component (e.g., disk, memory, or node). In cloud systems, many fault forecasting tasks share similar characteristics: 1) they are based on temporal monitoring data, and 2) they usually suffer from similar challenges (e.g., extreme data imbalance). In this work, we present F3, a unified fault forecasting framework for cloud systems. In particular, F3 introduces an end-to-end pipeline for a variety of fault forecasting tasks in cloud systems; the pipeline consists of several critical parts (e.g., data processing, fault forecasting, prediction result interpretation, and action decision). As a result, when a new fault forecasting task arrives, F3 can be adapted to handle it easily and effectively. In addition, F3 is able to overcome other challenges, including extreme data imbalance, data inconsistency between online and offline environments, and model overfitting. More encouragingly, F3 has been successfully applied to Microsoft Azure and has helped significantly reduce the number of virtual machine interruptions.

Sat 29 May
Times are displayed in time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

16:30 - 17:20
Project Showcase Session - CloudIntelligence 2021 at CloudIntelligence Room
Chair(s): Yingnong Dang (Microsoft, USA)
16:30
12m
Demonstration
Building a Secured Data Intelligence Platform
Conan Yang (Salesforce)
16:42
12m
Demonstration
Infusing ML into VM Provisioning in Cloud
Chuan Luo (Microsoft Research, China), Randolph Yao (Microsoft, USA), Bo Qiao (Microsoft Research, Beijing, China), Qingwei Lin (Microsoft Research, Beijing, China), Tri M. Tran (Microsoft Azure), Gil Shafriri (Microsoft Azure), Yingnong Dang (Microsoft, USA), Raphael Ghelman (Microsoft Azure), Pulak Goyal (Microsoft Azure), Eli Cortez (Microsoft Azure), Daud Howlader (Microsoft Azure), Sushant Rewaskar (Microsoft Azure), Murali Chintalapati (Microsoft Azure), Dongmei Zhang (Microsoft Research)
16:55
12m
Demonstration
F3: Fault Forecasting Framework for Cloud Systems
Chuan Luo (Microsoft Research, China), Pu Zhao (Microsoft Research, Beijing, China), Bo Qiao (Microsoft Research, Beijing, China), Youjiang Wu (Microsoft, USA), Yingnong Dang (Microsoft, USA), Murali Chintalapati (Microsoft Azure), Susy Yi (Microsoft 365), Paul Wang (Microsoft 365), Andrew Zhou (Microsoft 365), Saravanakumar Rajmohan (Microsoft Office, United States), Qingwei Lin (Microsoft Research, Beijing, China), Dongmei Zhang (Microsoft Research)
17:07
12m
Demonstration
SEAT: statistically sound infra-side deployment and integration testing
Nutcha Temiyasathit (Facebook), Tao Yang (Facebook), Karan Luthra (Facebook), Nick Ruff (Facebook), Petar Zuljevic (Facebook), Ethan Benowitz (Facebook), Boris Baracaldo (Facebook), Oytun Eskiyenenturk (Facebook), Xin Fu (Facebook)
