ICSE 2021
Mon 17 May - Sat 5 June 2021

Cloud computing is ubiquitous: more and more companies are moving their workloads into the Cloud. However, this rise in popularity challenges Cloud service providers, as they need to monitor the quality of their ever-growing offerings effectively. To address the challenge, we designed and implemented an automated monitoring system for the IBM Cloud Platform. This monitoring system utilizes deep learning neural networks to detect anomalies in near-real-time in multiple Platform components simultaneously.

After running the system for a year, we observed that the proposed solution frees the DevOps team’s time and human resources from manually monitoring thousands of Cloud components. Moreover, it increases customer satisfaction by reducing the risk of Cloud outages.

In this paper, we share our solution’s architecture, implementation notes, and best practices that emerged while evolving the monitoring system. They can be leveraged by other researchers and practitioners to build anomaly detectors for complex systems.

Wed 26 May

Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

18:50 - 19:50
2.5.4. Some Big Companies' Practices: Cases at Facebook, Google & IBM
SEIP - Software Engineering in Practice at Blended Sessions Room 4 +12h
Chair(s): Davide Falessi California Polytechnic State University
18:50
20m
Paper
Testing Web Enabled Simulation at Scale Using Metamorphic Testing
SEIP - Software Engineering in Practice
Mark Harman Facebook, Inc., John Ahlgren Facebook, Maria Eugenia Berezin Facebook, Elena Dulskyte Facebook, Inna Dvortsova Facebook, Johann George Facebook, Natalija Gucevska Facebook, Erik Meijer, Justin Spahr-Summers Facebook, Kinga Bojarczuk Facebook, Silvia Sapora Facebook, Maria Lomeli Facebook
19:10
20m
Paper
Anomaly Detection in a Large-scale Cloud Platform
SEIP - Software Engineering in Practice
Mohammad Saiful Islam Ryerson University, William Pourmajidi Ryerson University, Lei Zhang Ryerson University, John Steinbacher IBM, Tony Erwin IBM, Andriy Miranskyy Ryerson University
19:30
20m
Paper
Smart Build Targets Batching Service at Google
SEIP - Software Engineering in Practice
Kaiyuan Wang Google, USA, Daniel Rall Google, Greg Tener Google, Vijay Gullapalli Google, Xin Huang, Ahmed Gad Google

Thu 27 May

Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

06:50 - 07:50
2.5.4. Some Big Companies' Practices: Cases at Facebook, Google & IBM
SEIP - Software Engineering in Practice at Blended Sessions Room 4
06:50
20m
Paper
Testing Web Enabled Simulation at Scale Using Metamorphic Testing
SEIP - Software Engineering in Practice
Mark Harman Facebook, Inc., John Ahlgren Facebook, Maria Eugenia Berezin Facebook, Elena Dulskyte Facebook, Inna Dvortsova Facebook, Johann George Facebook, Natalija Gucevska Facebook, Erik Meijer, Justin Spahr-Summers Facebook, Kinga Bojarczuk Facebook, Silvia Sapora Facebook, Maria Lomeli Facebook
07:10
20m
Paper
Anomaly Detection in a Large-scale Cloud Platform
SEIP - Software Engineering in Practice
Mohammad Saiful Islam Ryerson University, William Pourmajidi Ryerson University, Lei Zhang Ryerson University, John Steinbacher IBM, Tony Erwin IBM, Andriy Miranskyy Ryerson University
07:30
20m
Paper
Smart Build Targets Batching Service at Google
SEIP - Software Engineering in Practice
Kaiyuan Wang Google, USA, Daniel Rall Google, Greg Tener Google, Vijay Gullapalli Google, Xin Huang, Ahmed Gad Google