CloudIntelligence 2021
Sat 29 May 2021
co-located with ICSE 2021
Sat 29 May 2021 16:55 - 17:07 at CloudIntelligence Room - Project Showcase Session Chair(s): Yingnong Dang

In recent years, cloud systems (e.g., Microsoft Azure) have grown explosively, and a wide variety of software services are now deployed on them. Because cloud systems must serve customers on a 24/7 basis, high service reliability is essential. To reduce the number of faults in cloud systems, many machine-learning-based fault forecasting methods have been proposed. These methods aim to predict faults in advance so that proactive actions can be taken to avoid negative impact, and each mainly focuses on a specific hardware component (e.g., disk, memory, or node). In cloud systems, many fault forecasting tasks share similar characteristics: 1) they are based on temporal monitoring data, and 2) they usually face similar challenges (e.g., extreme data imbalance). In this work, we present a unified fault forecasting framework for cloud systems, dubbed F3. In particular, F3 introduces an end-to-end pipeline for a variety of fault forecasting tasks in cloud systems; the pipeline consists of several critical parts (e.g., data processing, fault forecasting, prediction result interpretation, and action decision). As a result, when a new fault forecasting task arrives, F3 can be adapted easily and effectively to handle it. F3 also addresses several common challenges, including extreme data imbalance, data inconsistency between online and offline environments, and model overfitting. Encouragingly, F3 has been successfully applied to Microsoft Azure and has helped significantly reduce the number of virtual machine interruptions.

Sat 29 May

Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

16:30 - 17:20
Project Showcase Session, CloudIntelligence 2021, at CloudIntelligence Room
Chair(s): Yingnong Dang (Microsoft, USA)
16:30
12m
Demonstration
Building a Secured Data Intelligence Platform
CloudIntelligence 2021
Conan Yang (Salesforce)
16:42
12m
Demonstration
Infusing ML into VM Provisioning in Cloud
CloudIntelligence 2021
Chuan Luo (Microsoft Research, China), Randolph Yao (Microsoft, USA), Bo Qiao (Microsoft Research, Beijing, China), Qingwei Lin (Microsoft Research, Beijing, China), Tri M. Tran (Microsoft Azure), Gil Shafriri (Microsoft Azure), Yingnong Dang (Microsoft, USA), Raphael Ghelman (Microsoft Azure), Pulak Goyal (Microsoft Azure), Eli Cortez (Microsoft Azure), Daud Howlader (Microsoft Azure), Sushant Rewaskar (Microsoft Azure), Murali Chintalapati (Microsoft Azure), Dongmei Zhang (Microsoft Research)
16:55
12m
Demonstration
F3: Fault Forecasting Framework for Cloud Systems
CloudIntelligence 2021
Chuan Luo (Microsoft Research, China), Pu Zhao (Microsoft Research, Beijing, China), Bo Qiao (Microsoft Research, Beijing, China), Youjiang Wu (Microsoft, USA), Yingnong Dang (Microsoft, USA), Murali Chintalapati (Microsoft Azure), Susy Yi (Microsoft 365), Paul Wang (Microsoft 365), Andrew Zhou (Microsoft 365), Saravanakumar Rajmohan (Microsoft Office, United States), Qingwei Lin (Microsoft Research, Beijing, China), Dongmei Zhang (Microsoft Research)
17:07
12m
Demonstration
SEAT: statistically sound infra-side deployment and integration testing
CloudIntelligence 2021
Nutcha Temiyasathit (Facebook), Tao Yang (Facebook), Karan Luthra (Facebook), Nick Ruff (Facebook), Petar Zuljevic (Facebook), Ethan Benowitz (Facebook), Boris Baracaldo (Facebook), Oytun Eskiyenenturk (Facebook), Xin Fu (Facebook)
