SANER 2026
Tue 17 - Fri 20 March 2026 Limassol, Cyprus

Log parsing is a critical stage in log analysis: converting log messages into structured event templates enables automated log analysis and reduces manual inspection effort. To select the most suitable parser for a specific system, multiple evaluation metrics are commonly used for performance comparison. However, we noticed that existing log parser evaluation metrics rely heavily on labeled log data, which limits prior studies to a fixed set of datasets and hinders parser evaluation and selection in industry. Further, we discovered that the different versions of ground truth used in existing studies can lead to inconsistent performance conclusions. Motivated by these challenges, we propose a novel label-free, template-level metric, PMSS (parser medoid silhouette score), to evaluate log parser performance. PMSS evaluates both parser grouping and template quality using medoid silhouette analysis and Levenshtein distance, generally in linear time. To understand its relationship with label-based template-level metrics (i.e., FGA and FTA), we compared their evaluation outcomes for six log parsers on the standard corrected Loghub 2.0 dataset. Our results indicate that log parsers achieving the highest PMSS or FGA exhibit comparable performance, differing by only 2.1% on average in terms of FGA score; the difference is 9.8% for FTA. PMSS is also significantly ($p < 10^{-8}$) and positively correlated with both FGA and FTA: the Spearman's $\rho$ correlation coefficients for PMSS-FGA and PMSS-FTA are 0.648 and 0.587, respectively. Based on the experiments, we discuss how to interpret these conclusions and provide guidelines for parser selection with our metric. Our label-free metric provides a valuable evaluation alternative when ground truths are inconsistent or no labeled data is available.
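To make the idea of a medoid silhouette over Levenshtein distance concrete, the sketch below scores a parser's output without any labels. It is not the paper's PMSS definition: it assumes each parsed template acts as the representative (medoid) of its group and uses the simplified silhouette form (b - a) / max(a, b), where a is the distance from a message to its own template and b is the distance to the nearest other template. The names `levenshtein`, `medoid_silhouette`, and the `groups` layout are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string on the inner loop
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def medoid_silhouette(groups: Dict[str, List[str]]) -> float:
    """Mean simplified silhouette over all messages.

    `groups` maps each template (assumed here to play the medoid role)
    to the log messages the parser assigned to it.
    """
    templates = list(groups)
    if len(templates) < 2:
        raise ValueError("need at least two groups to measure separation")
    scores = []
    for template, messages in groups.items():
        for msg in messages:
            a = levenshtein(msg, template)  # cohesion: distance to own template
            b = min(levenshtein(msg, other)  # separation: nearest other template
                    for other in templates if other != template)
            denom = max(a, b)
            scores.append(0.0 if denom == 0 else (b - a) / denom)
    if not scores:
        raise ValueError("no messages to score")
    return sum(scores) / len(scores)


if __name__ == "__main__":
    parsed = {
        "Connection from <*>": ["Connection from 10.0.0.1",
                                "Connection from 10.0.0.2"],
        "Disk full on <*>": ["Disk full on sda1"],
    }
    print(medoid_silhouette(parsed))
```

Scores near 1 indicate messages that sit close to their own template and far from every other template; scores near -1 indicate messages that would fit another template better. Each message is compared once against every template, so the cost grows linearly with the number of messages for a fixed set of templates.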

Thu 19 Mar

Displayed time zone: Athens

11:00 - 12:30
Session 4C - Log Analysis, Observability, and Software Behavior
Tool Demo Track / Journal First Track / Research Track / Industrial Track / Short Papers and Posters Track / Registered Report Track at Megaron Gamma
Chair(s): Alexander Berndt Heidelberg University
11:00
15m
Talk
A Story About Cohesion and Separation: Unsupervised Metric for Log Parser Evaluation
Research Track
Qiaolin Qin Polytechnique Montréal, Jianchen Zhao University of Waterloo, Heng Li Polytechnique Montréal, Weiyi Shang University of Waterloo, Ettore Merlo Polytechnique Montreal
Pre-print
11:15
15m
Talk
Impact of log parsing on deep learning-based anomaly detection
Journal First Track
Zanis Ali Khan Luxembourg Institute of Science and Technology, Donghwan Shin University of Sheffield, Domenico Bianculli University of Luxembourg, Lionel Briand University of Ottawa, Canada; Lero centre, University of Limerick, Ireland
11:30
15m
Talk
Empirical Characterization of Logging Smells in Machine Learning Code.
Registered Report Track
Foalem Patrick Loic Polytechnique Montréal, Leuson Da Silva Polytechnique Montreal, Foutse Khomh Polytechnique Montréal, Heng Li Polytechnique Montréal
11:45
15m
Talk
Extracting Causal Relations from Log Sequences Using Causal Language Models
Industrial Track
12:00
7m
Talk
VisualLogAnalyzer: An Interactive Web Application for Multi-Level Log Analysis
Tool Demo Track
Jesse Nyyssölä University of Helsinki, Simo Sipilä, Mika Mäntylä University of Helsinki and University of Oulu
12:07
7m
Talk
DumpSuite: A Web-Based Platform for Core Dump Management and Analysis
Tool Demo Track
12:14
7m
Talk
Towards Observation Lakehouses: Living, Interactive Archives of Software Behavior
Tool Demo Track
12:21
7m
Talk
A Lightweight Visual Query System for Resource-Constrained Windows Log Analysis
Short Papers and Posters Track
Feifan Lu University of Glasgow, Burak Kizilkaya University of Glasgow