Improving LLM-based Log Parsing by Learning from Errors in Reasoning Traces
Recent advances in reasoning-capable large language models (LLMs) have led to their application in a wide range of tasks, including log parsing. These LLMs generate intermediate reasoning traces during inference, offering a unique opportunity to analyze and improve their performance. In this work, we investigate how reasoning traces can be leveraged to enhance LLM-based log parsers. We propose TraceDoctor, a framework that analyzes the reasoning traces associated with parsing errors to understand the causes of failure. We categorize these causes into high-level error types and design targeted log-variant generation strategies guided by them. The generated variants are then used to fine-tune the LLMs. We instantiate five state-of-the-art (SOTA) reasoning-capable LLMs as log parsers and identify 29 distinct high-level error types. Our approach improves their average parsing accuracy (PA) by up to 17.3% and grouping accuracy (GA) by up to 16.3%.