ICSE 2024
Fri 12 - Sun 21 April 2024 Lisbon, Portugal
Wed 17 Apr 2024 16:00 - 16:15 at Sophia de Mello Breyner Andresen - Analytics 2 Chair(s): Grace Lewis

Log data is a crucial resource for recording system events and states during system execution. However, as systems grow in scale, the volume of generated log data has exploded, leading to expensive storage overhead that can reach several petabytes per day in production. Log compression has therefore become a crucial task for reducing disk storage while still allowing further log analysis. Unfortunately, existing general-purpose and log-specific compression methods are limited in their ability to exploit the characteristics of log data. To overcome these limitations, we conduct an empirical study and identify three major observations on the characteristics of log data that can facilitate log compression. Based on these observations, we propose LogShrink, a novel and effective log compression method that leverages the commonality and variability of log data. An analyzer based on Longest Common Subsequence and entropy techniques identifies the latent commonality and variability in log messages. The key idea is that commonality and variability can be exploited to give log data a shorter representation. In addition, a clustering-based sequence sampler is introduced to accelerate the commonality and variability analyzer. Extensive experimental results demonstrate that LogShrink exceeds baselines in compression ratio by 16% to 356% on average while preserving a reasonable compression speed.
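
The abstract names two building blocks, Longest Common Subsequence (LCS) and entropy, without giving details. The following is a minimal illustrative sketch of how those two generic techniques can expose commonality and variability in a handful of log lines. It is not the LogShrink implementation: the sample log lines, the whitespace tokenization, and the helper names `lcs` and `entropy` are all assumptions made for illustration.

```python
import math
from collections import Counter


def lcs(a, b):
    """Longest common subsequence of two token lists (classic dynamic programming)."""
    m, n = len(a), len(b)
    # dp[i][j] = length of the LCS of a[i:] and b[j:]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Walk the table to recover one LCS.
    out, i, j = [], 0, 0
    while i < m and j < n:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return out


def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical log lines, invented for this sketch.
logs = [
    "Received block blk_1 of size 512",
    "Received block blk_2 of size 512",
    "Received block blk_3 of size 1024",
    "Received block blk_4 of size 512",
]
tokens = [line.split() for line in logs]

# Commonality: tokens shared, in order, by every line.
common = tokens[0]
for t in tokens[1:]:
    common = lcs(common, t)

# Variability: per-position entropy (all sample lines have the same length).
col_entropy = {i: entropy(col) for i, col in enumerate(zip(*tokens))}

print(common)
print(col_entropy)
```

In this toy input the shared tokens form a template-like skeleton, while the per-position entropy separates constant positions (entropy 0) from highly variable ones such as the block identifier. A compressor could, under these assumptions, store the skeleton once and encode only the variable positions.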

Wed 17 Apr

16:00 - 17:30
Analytics 2: Research Track / Journal-first Papers / Demonstrations at Sophia de Mello Breyner Andresen
Chair(s): Grace Lewis Carnegie Mellon Software Engineering Institute
16:00
15m
Talk
LogShrink: Effective Log Compression by Leveraging Commonality and Variability of Log Data
Research Track
Xiaoyun Li Sun Yat-sen University, Hongyu Zhang Chongqing University, Van-Hoang Le The University of Newcastle, Pengfei Chen Sun Yat-sen University
16:15
15m
Talk
Demystifying Compiler Unstable Feature Usage and Impacts in the Rust Ecosystem
Research Track
Chenghao Li Zhejiang University, Yifei Wu Zhejiang University, Wenbo Shen Zhejiang University, China, Zichen Zhao Zhejiang University, Rui Chang Zhejiang University, Chengwei Liu Nanyang Technological University, Yang Liu Nanyang Technological University, Kui Ren Zhejiang University
16:30
15m
Talk
Resource Usage and Optimization Opportunities in Workflows of GitHub Actions
Research Track
Islem BOUZENIA University of Stuttgart, Michael Pradel University of Stuttgart
16:45
15m
Talk
Revealing Hidden Threats: An Empirical Study of Library Misuse in Smart Contracts
Research Track
Mingyuan Huang Sun Yat-sen University, Jiachi Chen Sun Yat-sen University, Zigui Jiang Sun Yat-sen University, Zibin Zheng Sun Yat-sen University
17:00
7m
Talk
A Grounded Theory of Cross-community SECOs: Feedback Diversity vs. Synchronization
Journal-first Papers
Armstrong Foundjem Queen's University, Ellis E. Eghan University of Cape Coast, Ghana, Bram Adams Queen's University
17:07
7m
Talk
Studying the Characteristics of AIOps Projects on GitHub
Journal-first Papers
Roozbeh Aghili Polytechnique Montréal, Heng Li Polytechnique Montréal, Foutse Khomh École Polytechnique de Montréal
17:14
7m
Talk
A First Look at Dark Mode in Real-World Android Apps
Journal-first Papers
Suyu Ma Monash University, Chunyang Chen Technical University of Munich (TUM), Hourieh Khalajzadeh Deakin University, Australia, John Grundy Monash University
17:21
7m
Talk
GitBug-Actions: Building Reproducible Bug-Fix Benchmarks with GitHub Actions
Demonstrations
Nuno Saavedra INESC-ID and IST, University of Lisbon, André Silva KTH Royal Institute of Technology, Martin Monperrus KTH Royal Institute of Technology