ISSTA 2021
Sun 11 - Sat 17 July 2021 Online
co-located with ECOOP and ISSTA 2021
Sat 17 Jul 2021 01:10 - 01:30 at ISSTA 2 - Session 22 (time band 2) Bugs and Analysis 1 Chair(s): Saeid Tizpaz-Niari
Sat 17 Jul 2021 09:30 - 09:50 at ISSTA 1 - Session 27 (time band 3) Bugs and Analysis 2 Chair(s): Mike Papadakis

With the widespread use of cloud-native architecture, a growing number of web applications (apps) are built on microservices. At the same time, troubleshooting becomes increasingly challenging owing to the highly dynamic and complex nature of anomaly propagation. Existing diagnostic methods rely heavily on monitoring metrics collected from the kernel side of microservice systems. Without a comprehensive monitoring infrastructure, application owners and even cloud operators cannot rely on these kernel-space solutions.
This paper summarizes several insights from operating a top commercial cloud platform. Then, for the first time, we put forward the idea of user-space diagnosis for microservice kernel failures.
To this end, we develop a crowdsourcing solution, DyCause, to resolve the asymmetric diagnostic information problem.
DyCause is deployed on the application side in a distributed manner. Through lightweight API log sharing, apps collaboratively collect the operational status of kernel services and initiate diagnosis on demand. Deploying DyCause is fast and lightweight because it imposes no architectural or functional requirements on the kernel.
To reveal more accurate correlations from asymmetric diagnostic information, we design a novel statistical algorithm that can efficiently discover the time-varying causalities between services. This algorithm also helps us build the temporal order of the anomaly propagation. Therefore, by using DyCause, we can obtain more in-depth and interpretable diagnostic clues with limited indicators.
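The idea of a time-varying directional dependence between two services can be illustrated with a minimal sketch. This is not the DyCause algorithm, which the abstract does not specify; it swaps in a much simpler technique, a lagged Pearson correlation computed per sliding window. The function names, the window size, the lag, and the 0.8 threshold are all hypothetical choices made for illustration, and the series `x` and `y` are synthetic.

```python
from statistics import mean, pstdev


def lagged_corr(x, y, lag):
    """Pearson correlation between x[t] and y[t + lag] (0.0 if either side is constant)."""
    a, b = x[:len(x) - lag], y[lag:]
    sa, sb = pstdev(a), pstdev(b)
    if sa == 0 or sb == 0:
        return 0.0
    ma, mb = mean(a), mean(b)
    cov = mean((u - ma) * (v - mb) for u, v in zip(a, b))
    return cov / (sa * sb)


def windowed_direction(x, y, window, step, lag=1, threshold=0.8):
    """Label each sliding window as 'x->y', 'y->x', or 'none'.

    Illustrative stand-in only: a window is labeled 'x->y' when the past of x
    correlates strongly with the future of y, and more strongly than the reverse.
    """
    results = []
    for start in range(0, len(x) - window + 1, step):
        xs, ys = x[start:start + window], y[start:start + window]
        xy = abs(lagged_corr(xs, ys, lag))
        yx = abs(lagged_corr(ys, xs, lag))
        if xy >= threshold and xy > yx:
            label = "x->y"
        elif yx >= threshold and yx > xy:
            label = "y->x"
        else:
            label = "none"
        results.append((start, label))
    return results


# Synthetic data (assumption): y follows x with a one-step delay for the first
# 20 samples, then flattens out and carries no information about x.
x = [(i * 7) % 11 for i in range(40)]
y = [0] + x[:19] + [5] * 20

result = windowed_direction(x, y, window=20, step=20)
print(result)
```

On this synthetic input the first window is labeled `x->y` and the second `none`, which shows how a dependence that holds only for part of the timeline can be localized by windowing. A real diagnosis would need a proper statistical causality test and significance control rather than a fixed correlation threshold.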
We apply and evaluate DyCause on both a simulated test-bed and a real-world cloud system. Experimental results verify that DyCause running in user space is more accurate than several state-of-the-art algorithms running in kernel space. In addition, DyCause shows clear advantages in algorithmic efficiency and data sensitivity. Simply put, DyCause produces significantly better results than the baselines while analyzing far fewer or sparser metrics. To conclude, DyCause is faster to act, deeper in analysis, and easier to deploy.

DyCause Slides (ISSTA21-DyCause-Slides.pdf), 2.65 MiB

Sat 17 Jul

Displayed time zone: Brussels, Copenhagen, Madrid, Paris

01:10 - 02:30
Session 22 (time band 2) Bugs and Analysis 1 Technical Papers at ISSTA 2
Chair(s): Saeid Tizpaz-Niari University of Texas at El Paso
01:10
20m
Talk
Faster, Deeper, Easier: Crowdsourcing Diagnosis of Microservice Kernel Failure from User Space
Technical Papers
Yicheng Pan Peking University, Meng Ma Peking University, Xinrui Jiang Peking University, Ping Wang Peking University
01:30
20m
Talk
iDEV: Exploring and Exploiting Semantic Deviations in ARM Instruction Processing
Technical Papers
Shisong Qin Tsinghua University, Chao Zhang Tsinghua University, Kaixiang Chen Tsinghua University, Zheming Li Tsinghua University
01:50
20m
Talk
RAProducer: Efficiently Diagnose and Reproduce Data Race Bugs for Binaries via Trace Analysis
Technical Papers
Ming Yuan Tsinghua University, Yeseop Lee Tsinghua University, Chao Zhang Tsinghua University, Yun Li Tsinghua University, Yan Cai Institute of Software at Chinese Academy of Sciences, Bodong Zhao Tsinghua University
02:10
20m
Talk
Fixing Dependency Errors for Python Build Reproducibility
Technical Papers
Suchita Mukherjee University of California at Davis, Abigail Almanza University of California at Davis, Cindy Rubio-González University of California at Davis
09:30 - 11:10
Session 27 (time band 3) Bugs and Analysis 2 Technical Papers at ISSTA 1
Chair(s): Mike Papadakis University of Luxembourg, Luxembourg
09:30
20m
Talk
Faster, Deeper, Easier: Crowdsourcing Diagnosis of Microservice Kernel Failure from User Space
Technical Papers
Yicheng Pan Peking University, Meng Ma Peking University, Xinrui Jiang Peking University, Ping Wang Peking University
09:50
20m
Talk
Finding Data Compatibility Bugs with JSON Subschema Checking (Distinguished Artifact)
Technical Papers
Andrew Habib SnT, University of Luxembourg, Avraham Shinnar IBM Research, Martin Hirzel IBM Research, Michael Pradel University of Stuttgart
10:10
20m
Talk
Semantic Table Structure Identification in Spreadsheets
Technical Papers
Yakun Zhang Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Xiao Lv Microsoft Research, Haoyu Dong Microsoft Research, Wensheng Dou Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Shi Han Microsoft Research, Dongmei Zhang Microsoft Research, Jun Wei Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Dan Ye Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences
10:30
20m
Talk
Deep Just-in-Time Defect Prediction: How Far Are We?
Technical Papers
Zhengran Zeng Southern University of Science and Technology, Yuqun Zhang Southern University of Science and Technology, Haotian Zhang Kwai, Lingming Zhang University of Illinois at Urbana-Champaign
10:50
20m
Talk
Continuous Test Suite Failure Prediction
Technical Papers
Cong Pan Beihang University, Michael Pradel University of Stuttgart