Large-Scale Trace Analysis for Microservice Architecture Understanding and Fault Analysis
Operations engineers and developers rely heavily on trace analysis to understand architectures and diagnose problems such as service failures and quality degradation. However, the huge number of traces produced at runtime makes it challenging to capture the required information in real time. In this talk, I will present two recent works on large-scale trace analysis for microservice architecture understanding and fault analysis, GMTA and MicroHECL, both carried out in collaboration with industrial partners. Built on a graph-based representation, GMTA abstracts traces into paths, groups the paths into business flows, and supports various analytical applications through an efficient storage and access mechanism. MicroHECL is a highly efficient root cause localization approach for availability issues in microservice systems. It analyzes possible anomaly propagation chains and ranks candidate root causes based on correlation analysis. Both works have been applied in the production systems of our industrial partners.
Xin Peng received his bachelor’s and PhD degrees in computer science from Fudan University in 2001 and 2006, respectively. He is a professor at the School of Computer Science, Fudan University, China. His research interests include data-driven intelligent software development, cloud-native software and AIOps, and software engineering for AI and cyber-physical-social systems. His work won the ICSM 2011 Best Paper Award, the ACM SIGSOFT Distinguished Paper Award at ASE 2018, the IEEE TCSE Distinguished Paper Awards at ICSME 2018/2019/2020, and the IEEE Transactions on Software Engineering 2018 Best Paper Award. He was a steering committee member of the International Conference on Software Maintenance and Evolution (ICSME) from 2017 to 2020. He is currently a co-editor of the Journal of Software: Evolution and Process (JSEP) and an editorial board member of ACM Transactions on Software Engineering and Methodology (TOSEM), Empirical Software Engineering (EMSE), and the Chinese Journal of Software.
Sat 29 May
Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna
13:55 - 14:15
Invited Talk #2
CloudIntelligence 2021 at CloudIntelligence Room
Chair(s): Qingwei Lin Microsoft Research, Beijing, China
Large-Scale Trace Analysis for Microservice Architecture Understanding and Fault Analysis
Xin Peng Fudan University, China