Defects4Log: Benchmarking LLMs for Logging Code Defect Detection and Reasoning
This program is tentative and subject to change.
Logging code is written by developers to capture system runtime behavior and plays a vital role in debugging, performance analysis, and system monitoring. However, defects in logging code can undermine the usefulness of logs and lead to misinterpretations. Although prior work has identified several logging defect patterns and provided valuable insights into logging practices, these studies often focus on a narrow range of defect patterns derived from limited sources (e.g., commit histories) and lack a systematic and comprehensive analysis. Moreover, large language models (LLMs) have demonstrated promising generalization and reasoning capabilities across a variety of code-related tasks, yet their potential for detecting logging code defects remains largely unexplored.
In this experience paper, we derive a comprehensive taxonomy of logging code defects, which encompasses seven logging code defect patterns with 14 detailed scenarios. We further construct a benchmark dataset, Defects4Log, consisting of 164 developer-verified real-world logging defects. We then propose an automated framework that leverages various prompting strategies and contextual information to evaluate LLMs’ capability in detecting and reasoning about logging code defects. Experimental results reveal that LLMs generally struggle to accurately detect and reason about logging code defects based on the source code only. However, incorporating proper knowledge (e.g., detailed scenarios of defect patterns) can lead to a 10.9% improvement in detection accuracy. Overall, our findings provide actionable guidance for practitioners to avoid common defect patterns and establish a foundation for improving LLM-based reasoning in logging code defect detection.
Mon 17 Nov (displayed time zone: Seoul)
11:00 - 12:40
11:00 | 10m Talk | LogMoE: Lightweight Expert Mixture for Cross-System Log Anomaly Detection | Research Papers | Jiaxing Qi Beihang University, Zhongzhi Luan Beihang University, Shaohan Huang Beihang University, Carol Fung Concordia University, Yuchen Wang Beihang University, Aibin Wang Beihang University, Hongyu Zhang Chongqing University, Hailong Yang Beihang University, China, Depei Qian Beihang University, China
11:10 | 10m Talk | Improving LLM-based Log Parsing by Learning from Errors in Reasoning Traces | Research Papers | Wang Jialai National University of Singapore, Juncheng Lu Southeast University, Jie Yang Wuhan University, Junjie Wang Institute of Software at Chinese Academy of Sciences, Zeyu Gao Tsinghua University, Chao Zhang Tsinghua University, Zhenkai Liang NUS, Ee-Chien Chang School of Computing, NUS
11:20 | 10m Talk | LogUpdater: Automated Detection and Repair of Specific Defects in Logging Statements | Journal-First Track | Renyi Zhong The Chinese University of Hong Kong, Yichen LI ByteDance, Jinxi Kuang The Chinese University of Hong Kong, Wenwei Gu The Chinese University of Hong Kong, Yintong Huo Singapore Management University, Singapore, Michael Lyu The Chinese University of Hong Kong
11:30 | 10m Talk | LogAction: Consistent Cross-system Anomaly Detection through Logs via Active Domain Adaptation | Research Papers | Chiming Duan Peking University, Minghua He Peking University, Pei Xiao Peking University, Tong Jia Institute for Artificial Intelligence, Peking University, Beijing, China, Xin Zhang Peking University, Zhewei Zhong Bytedance, Xiang Luo Bytedance, Yan Niu Bytedance, Lingzhe Zhang Peking University, China, Yifan Wu Peking University, Siyu Yu The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), Weijie Hong Peking University, Ying Li School of Software and Microelectronics, Peking University, Beijing, China, Gang Huang Peking University
11:40 | 10m Talk | Diplomatist: What Do Cross-language Dependencies Reflect Software Ecosystem Health? | Research Papers | Fanyi Meng Shenyang University of Technology, Ying Wang Northeastern University, Chun Yong Chong Monash University Malaysia, Hai Yu Northeastern University, China, Zhiliang Zhu Northeastern University, China
11:50 | 10m Talk | Defects4Log: Benchmarking LLMs for Logging Code Defect Detection and Reasoning | Research Papers | Xin Wang Changsha University of Science and Technology, Zhenhao Li York University, Zishuo Ding The Hong Kong University of Science and Technology (Guangzhou)
12:00 | 10m Talk | Which Is Better For Reducing Outdated And Vulnerable Dependencies: Pinning Or Floating? | Research Papers | Imranur Rahman North Carolina State University, Jill Marley North Carolina State University, William Enck North Carolina State University, Laurie Williams North Carolina State University
12:10 | 10m Talk | On Automating Configuration Dependency Validation via Retrieval-Augmented Generation | Research Papers | Sebastian Simon Leipzig University, Alina Mailach Leipzig University, Johannes Dorn Leipzig University, Norbert Siegmund Leipzig University | Pre-print
12:20 | 10m Talk | CollaborLog: Efficient-Generalizable Log Anomaly Detection via Large-Small Model Collaboration in Software Evolution | Research Papers | Pei Xiao Peking University, Chiming Duan Peking University, Minghua He Peking University, Tong Jia Institute for Artificial Intelligence, Peking University, Beijing, China, Yifan Wu Peking University, Jing Xu ByteDance, Gege Gao ByteDance, Lingzhe Zhang Peking University, China, Weijie Hong Peking University, Ying Li School of Software and Microelectronics, Peking University, Beijing, China, Gang Huang Peking University
12:30 | 10m Talk | On the Robustness Evaluation of 3D Obstacle Detection Against Specifications in Autonomous Driving | Research Papers | Tri Minh-Triet Pham Concordia University, Bo Yang Concordia University, Jinqiu Yang Concordia University