CloudIntelligence 2021
Sat 29 May 2021
co-located with ICSE 2021
Sat 29 May 2021 12:25 - 12:40 at CloudIntelligence Room - Technical paper session #1 Chair(s): Qingwei Lin

Large-scale cloud systems such as Microsoft Azure, Google Cloud and Amazon AWS provide a wide variety of online services that serve millions of customers around the world. Adverse service behaviors and latencies can have a large performance impact, which affects user satisfaction. Besides system KPI metrics and logs, trace data is of great value for analyzing system performance status, detecting anomalous workstreams and localizing performance bottlenecks. However, existing work mostly represents a trace as a sequence of events with execution time information, which ignores the runtime context and graph structure of the trace. In this paper, we propose a trace representation and learning model, TraceLingo, which adopts a tree-based RNN model to capture the dependencies between spans in various traces for automatic and effective performance diagnosis.
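To illustrate why the graph structure of a trace matters, the sketch below rebuilds a span tree from parent links and folds it bottom-up into a single number. It is a toy stand-in, not TraceLingo's model: the `Span` fields, the `build_tree` and `encode` helpers, and the tanh child-sum update are all assumptions made for illustration, where the paper's approach would use learned tree-based RNN weights.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    # Hypothetical span record; field names are illustrative only.
    span_id: str
    parent_id: Optional[str]
    name: str
    duration_ms: float
    children: List["Span"] = field(default_factory=list)


def build_tree(spans: List[Span]) -> Span:
    """Link spans to their parents and return the root span."""
    by_id = {s.span_id: s for s in spans}
    root = None
    for s in spans:
        if s.parent_id is None:
            root = s
        else:
            by_id[s.parent_id].children.append(s)
    if root is None:
        raise ValueError("trace has no root span")
    return root


def encode(span: Span) -> float:
    """Toy bottom-up recursion: combine a span's own feature with its children's states."""
    child_sum = sum(encode(c) for c in span.children)
    return math.tanh(span.duration_ms / 1000.0 + child_sum)


# Two traces with the same spans in the same order but a different parent for "cache".
trace_a = [
    Span("1", None, "gateway", 100.0),
    Span("2", "1", "auth", 60.0),
    Span("3", "1", "db", 30.0),
    Span("4", "2", "cache", 50.0),
]
trace_b = [
    Span("1", None, "gateway", 100.0),
    Span("2", "1", "auth", 60.0),
    Span("3", "1", "db", 30.0),
    Span("4", "3", "cache", 50.0),
]
code_a = encode(build_tree(trace_a))
code_b = encode(build_tree(trace_b))
```

A flat, time-ordered view would treat `trace_a` and `trace_b` as the same sequence of spans, while the recursive encoding depends on which span called which.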

Sat 29 May
Times are displayed in time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

11:55 - 12:55
Technical paper session #1, CloudIntelligence 2021, at CloudIntelligence Room
Chair(s): Qingwei Lin (Microsoft Research, Beijing, China)
11:55
15m
Paper
PerfEstimator: A Generic and Extensible Performance Estimator for Data Parallel DNN Training
CloudIntelligence 2021
Chengru Yang (University of Science and Technology of China), Zhehao Li (University of Science and Technology of China), Chaoyi Ruan (University of Science and Technology of China), Guanbin Xu (University of Science and Technology of China), Cheng Li (University of Science and Technology of China), Ruichuan Chen (Nokia Bell Labs), Feng Yan (University of Nevada Reno)
12:10
15m
Paper
Kmon: An In-kernel Transparent Monitoring System for Microservice Systems with eBPF
CloudIntelligence 2021
Tianjun Weng (Sun Yat-Sen University), Wanqi Yang (Sun Yat-Sen University), Guangba Yu (Sun Yat-Sen University), Pengfei Chen (Sun Yat-Sen University), Jieqi Cui (Sun Yat-Sen University), Chuanfu Zhang (Sun Yat-Sen University)
12:25
15m
Paper
TraceLingo: Trace representation and learning for performance issue diagnosis in cloud services
CloudIntelligence 2021
Yong Xu (Microsoft, China), Yaokang Zhu (Microsoft Research Asia), Bo Qiao (Microsoft Research, Beijing, China), Hongshu Che (Microsoft Research, Beijing, China), Pu Zhao (Microsoft Research, Beijing, China), Xu Zhang (Microsoft Research, Beijing, China), Ze Li (Microsoft, USA), Yingnong Dang (Microsoft, USA), Qingwei Lin (Microsoft Research, Beijing, China)
12:40
15m
Paper
MicroDiag: Fine-grained Performance Diagnosis for Microservice Systems
CloudIntelligence 2021
Li Wu (Elastisys AB/Technische Universität Berlin), Johan Tordsson (Elastisys AB), Jasmin Bogatinovski, Erik Elmroth (Elastisys AB/Umea University), Odej Kao (Technische Universität Berlin)
