QuLog: Data-Driven Approach for Log Instruction Quality Assessment (ICPC 2022 - Research)

Who

Jasmin Bogatinovski, Sasho Nedelkoski, Alexander Acker, Jorge Cardoso, Odej Kao

Track

ICPC 2022 Research

Time Zone

The program is currently displayed in (GMT-04:00) Eastern Time (US & Canada).

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Mon 16 May 2022 11:31 - 11:38 at ICPC room - Session 7: Debugging 1. Chair(s): Kevin Moran

Abstract

In the current IT world, developers write code while system operators run the code mostly as a black box. The connection between both worlds is typically established with log messages: the developer provides hints to the (unknown) operator about where the cause of an issue lies, and, vice versa, the operator can report bugs during operation. To fulfil this purpose, developers write log instructions, which are structured text commonly composed of a log level (e.g., “info”, “error”), static text (e.g., “IP {} cannot be reached”), and dynamic variables (e.g., IP {}). However, in contrast to well-adopted coding practices, there are no widely adopted guidelines on how to write log instructions with good quality properties. For example, a developer may assign a high log level (e.g., “error”) to a trivial event, which can confuse the operator and increase maintenance costs. Or the static text can be insufficient to hint at a specific issue. In this paper, we address the problem of log quality assessment and provide the first step towards its automation. We start with an in-depth analysis of log instruction quality properties in nine software systems and identify two quality properties: 1) correct log level assignment, assessing the correctness of the log level, and 2) sufficient linguistic structure, assessing the minimal richness of the static text necessary for verbose event description. Based on these findings, we developed a data-driven approach that adapts deep learning methods for each of the two properties. An extensive evaluation on nine large-scale open-source systems shows that our approach correctly assesses log level assignments with an accuracy of 0.88, and the sufficient linguistic structure with an F1 score of 0.99, outperforming the baselines. Our study highlights the potential of data-driven methods in assessing log instruction quality and aiding developers in comprehending and writing better code.
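
To make the anatomy of a log instruction concrete, the following minimal Python sketch splits one into the three parts named in the abstract: log level, static text, and dynamic variables. It is only an illustration and not the QuLog method, which the abstract describes as deep-learning based. The class `LogInstruction`, the function `parse_log_instruction`, the regular expression, and the `log.error(...)` call style are all assumptions made for this example.

```python
import re
from dataclasses import dataclass


@dataclass
class LogInstruction:
    level: str          # e.g. "info", "error"
    static_text: str    # message template, placeholders included
    num_variables: int  # number of "{}" placeholders in the template


# Hypothetical pattern for calls such as: log.error("IP {} cannot be reached", ip)
# Assumption: the template is a double-quoted string without embedded quotes.
_CALL = re.compile(r'\.(?P<level>\w+)\(\s*"(?P<template>[^"]*)"')


def parse_log_instruction(source_line: str) -> LogInstruction:
    """Split a single log call into level, static text, and variable count."""
    match = _CALL.search(source_line)
    if match is None:
        raise ValueError("no log instruction found")
    template = match.group("template")
    return LogInstruction(
        level=match.group("level").lower(),
        static_text=template,
        num_variables=template.count("{}"),
    )


parsed = parse_log_instruction('log.error("IP {} cannot be reached", ip)')
```

Here `parsed` holds the level `error`, the template from the abstract's example, and one dynamic variable. A quality check in the spirit of the two properties above would take these fields as input; how QuLog itself represents them is not stated in the abstract.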

Jasmin Bogatinovski

Technical University Berlin

Sasho Nedelkoski

TU Berlin

Alexander Acker

Technical University Berlin

Jorge Cardoso

Huawei Munich Research Center

Odej Kao

Technische Universität Berlin

Session Program

Mon 16 May
Displayed time zone: Eastern Time (US & Canada)

11:10 - 12:10	Session 7: Debugging 1 (Research) at ICPC room. Chair(s): Kevin Moran (George Mason University)

11:10 7m Talk		Causette: User-Controlled Rearrangement of Causal Constructs in a Code Editor (Research). Alice Martin (ENAC - Université de Toulouse), Mathieu Magnaudet (ENAC - Université de Toulouse), Stéphane Conversy (ENAC - Université de Toulouse)
11:17 7m Talk		Error Identification Strategies for Python Jupyter Notebooks (Research). Derek Robinson (University of Victoria), Neil Ernst (University of Victoria), Enrique Larios Vargas (University of Victoria), Margaret-Anne Storey (University of Victoria)
11:24 7m Talk		Performance Anomaly Detection through Sequence Alignment of System-Level Traces (Research). Madeline Janecek (Brock University), Naser Ezzati Jivan, Wahab Hamou-Lhadj (Concordia University, Montreal, Canada)
11:31 7m Talk		QuLog: Data-Driven Approach for Log Instruction Quality Assessment (Research). Jasmin Bogatinovski (Technical University Berlin), Sasho Nedelkoski (TU Berlin), Alexander Acker (Technical University Berlin), Jorge Cardoso (Huawei Munich Research Center), Odej Kao (Technische Universität Berlin)
11:38 7m Talk		Fixing Continuous Integration Tests From Within the IDE With Contextual Information (Research). Casper Boone (Delft University of Technology), Carolin Brandt (Delft University of Technology), Andy Zaidman (Delft University of Technology)
11:45 7m Talk		Shape-Analysis Driven Memory Graph Visualization (Research). Jan H. Boockmann (University of Bamberg), Gerald Lüttgen (University of Bamberg)
11:52 18m Live Q&A		Q&A-Paper Session 7 (Research)
