Characterizing Logs in Vulnerability Reports: In-Depth Analysis and Security Implications
Software logs provide a rich source of data for tracing, debugging, and detecting software bugs. Despite their prevalence and usefulness in software development, the valuable data contained within logs attached to vulnerability reports remains largely unexplored. This study aims to bridge this gap by investigating the characteristics, rationales, and potential of logs for software vulnerability management. We conduct a comprehensive analysis of 1,118 Common Vulnerabilities and Exposures (CVEs) linked to issue reports, specifically focusing on the distribution and content of logs included in these reports. Our analysis reveals that exception logs are the most prevalent type. We identify three distinct categories of exception logs and examine their occurrence across various Common Weakness Enumeration (CWE) classifications. Additionally, we discover seven key rationales for attaching logs to vulnerability reports, highlighting the multifaceted role of logs in vulnerability reporting and analysis. Furthermore, we explore the feasibility of using logs to assist in vulnerability management, specifically for vulnerability location and security issue detection. Our experiments show that exception logs are effective in identifying vulnerable functions, successfully hitting at least one vulnerable function for 65.6% of the analyzed vulnerabilities. To support security issue detection, we apply three different approaches: a heuristic rule-based approach, K-means++, and Latent Dirichlet Allocation (LDA). We evaluate the three approaches on a total of 158,730 issue reports from 72 projects hosted on GitHub and Bugzilla. The results show that the heuristic rule-based and K-means++ approaches successfully identify true security issues, with precisions of 44.1% and 46.7%, respectively. Additionally, we discover that four issues associated with vulnerabilities remain open and unfixed.
While our findings demonstrate that logs can be beneficial in identifying potential vulnerabilities, there is significant room for developing more sophisticated tools and techniques that leverage this information more effectively. The paper concludes with lessons learned and potential future work, emphasizing the importance of logs in enhancing software security practices and the need for continued research on analyzing vulnerability-associated logs for vulnerability management.
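
As a rough illustration of the vulnerability-location idea summarized above, the sketch below extracts function names from an exception stack trace and checks whether any of them appears in a set of functions known to be vulnerable. It is not the procedure used in the study, which this abstract does not describe: the Java-style frame format, the regular expression, the sample log, and all class and function names are hypothetical assumptions chosen for illustration.

```python
import re

# Assumption: stack frames look like "at package.Class.method(File.java:123)".
# Other languages and logging frameworks would need different patterns.
FRAME_RE = re.compile(r"^\s*at\s+([\w.$<>]+)\(", re.MULTILINE)


def extract_functions(log_text):
    """Return the fully qualified function names found in stack frames, in order."""
    return FRAME_RE.findall(log_text)


def hit_vulnerable_functions(log_text, vulnerable_functions):
    """Return the frames in the log that match a known vulnerable function."""
    return [f for f in extract_functions(log_text) if f in vulnerable_functions]


# Hypothetical exception log, as it might be pasted into an issue report.
SAMPLE_LOG = """java.lang.NullPointerException: name is null
    at com.example.parser.HeaderParser.parse(HeaderParser.java:88)
    at com.example.server.RequestHandler.handle(RequestHandler.java:41)
    at java.lang.Thread.run(Thread.java:750)
"""

# Hypothetical ground truth: functions changed by the fixing commit.
VULNERABLE = {"com.example.parser.HeaderParser.parse"}

if __name__ == "__main__":
    print(hit_vulnerable_functions(SAMPLE_LOG, VULNERABLE))
```

In this sketch a report counts as a "hit" when the returned list is non-empty, which loosely mirrors the "at least one vulnerable function" criterion mentioned above.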