ISSTA 2025
Wed 25 - Sat 28 June 2025 Trondheim, Norway
co-located with FSE 2025
Wed 25 Jun 2025 14:25 - 14:50 at Aurora C - Runtime Analysis, Verification, and Slicing Chair(s): Heqing Huang

Logs generated by large-scale software systems contain a huge amount of useful information. As the first step of automated log analysis, log parsing has been extensively studied. General log parsing techniques focus on identifying static templates from raw logs, but overlook the more important semantics implied in dynamic log parameters. With the popularity of Artificial Intelligence for IT Operations (AIOps), traditional log parsing methods no longer meet the requirements of various downstream tasks. Researchers are now exploring the next generation of log parsing techniques, i.e., semantic log parsing, to identify both log templates and semantics in log parameters. However, the absence of semantic annotations in existing datasets hinders the training and evaluation of semantic log parsers, thereby stalling the progress of semantic log parsing.
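
To make the distinction concrete, the following minimal Python sketch contrasts a template-only result with a semantic one. The log line, the regular expression, and the labels `client_ip` and `duration_ms` are hypothetical and do not come from LogBase; a real semantic parser would infer such labels rather than read them from a hand-written pattern.

```python
import re
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SemanticParse:
    template: str               # static part, parameters masked as <*>
    parameters: Dict[str, str]  # semantic label -> concrete value


# Hypothetical pattern: the named groups stand in for semantic labels.
PATTERN = re.compile(
    r"Connection from (?P<client_ip>\d+\.\d+\.\d+\.\d+) "
    r"closed after (?P<duration_ms>\d+) ms"
)


def parse(line: str) -> Optional[SemanticParse]:
    """Return the template plus labelled parameters, or None if no match."""
    match = PATTERN.fullmatch(line)
    if match is None:
        return None
    template = line
    # Mask parameter spans right-to-left so earlier offsets stay valid.
    names = sorted(match.groupdict(), key=match.start, reverse=True)
    for name in names:
        start, end = match.span(name)
        template = template[:start] + "<*>" + template[end:]
    return SemanticParse(template, match.groupdict())
```

A template-only parser would stop at the `template` field; the `parameters` mapping is the extra information a semantic log parser is expected to recover.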

To fill this gap and advance the field of semantic log parsing, we construct LogBase, the first semantic log parsing benchmark dataset. LogBase consists of logs from 130 popular open-source projects, containing 85,300 semantically annotated log templates, surpassing existing datasets in both log source diversity and template richness. To build LogBase, we develop GenLog, a framework for constructing semantic log parsing datasets. GenLog mines log template-parameter-context triplets from popular open-source repositories on GitHub, and uses chain-of-thought (CoT) techniques with large language models (LLMs) to generate high-quality logs. Meanwhile, GenLog employs human feedback to improve the quality of the generated data and ensure its reliability. GenLog is highly automated and cost-effective, enabling researchers to easily and efficiently construct semantic log parsing datasets. Furthermore, we design a set of comprehensive evaluation metrics for LogBase, including general log parser metrics as well as metrics designed specifically for semantic log parsers and LLM-based parsers.

With LogBase, we extensively evaluate 15 existing log parsers, revealing their true performance in complex scenarios. We believe that this work provides researchers with valuable data, reliable tools, and insightful findings to support and guide future research on semantic log parsing.

Wed 25 Jun

Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

14:00 - 15:15
Runtime Analysis, Verification, and Slicing
Research Papers at Aurora C
Chair(s): Heqing Huang City University of Hong Kong
14:00
25m
Talk
Adding Spatial Memory Safety to EDK II through Checked C (Experience Paper)
Research Papers
Sourag Cherupattamoolayil Purdue University, Arunkumar Bhattar Purdue University, Connor Everett Glosner Purdue University, Aravind Machiry Purdue University
14:25
25m
Talk
LogBase: A Large-Scale Benchmark for Semantic Log Parsing
Research Papers
Chenbo Zhang Fudan University, Wenying Xu Fudan University, Jinbu Liu Alibaba, Lu Zhang Fudan University, Guiyang Liu Alibaba, Jihong Guan Tongji University, Qi Zhou Alibaba, Shuigeng Zhou Fudan University
14:50
25m
Talk
Static Program Reduction via Type-Directed Slicing
Research Papers
Loi Ngo Duc Nguyen University of California, Riverside, Tahiatul Islam New Jersey Institute of Technology, Theron Wang The Academy for Mathematics, Science & Engineering, USA, Sam Lenz New Jersey Institute of Technology, Martin Kellogg New Jersey Institute of Technology

Information for Participants
Wed 25 Jun 2025 14:00 - 15:15 at Aurora C - Runtime Analysis, Verification, and Slicing Chair(s): Heqing Huang
Info for room Aurora C:

Aurora C is the third room in the Aurora wing.

When facing the main Cosmos Hall, access to the Aurora wing is on the right, close to the side entrance of the hotel.
