Efficient Detection of Intermittent Job Failures Using Few-Shot Learning (ICSME 2025 - Industry Track) - ICSME 2025 - International Conference on Software Maintenance and Evolution

Who

Henri Aïdasso, Francis Bordeleau, Ali Tizghadam

Track

ICSME 2025 Industry Track

Time Zone

The program is currently displayed in (GMT+12:00) Auckland, Wellington.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Wed 10 Sep 2025 11:30 - 11:45 at Case Room 2 260-057 - Session 2 - Quality Assurance 1 Chair(s): Coen De Roover

Abstract

One of the main challenges developers face in the use of continuous integration (CI) and deployment pipelines is the occurrence of intermittent job failures, which result from unexpected non-deterministic issues (e.g., flaky tests or infrastructure problems) rather than regular code-related errors such as bugs. Prior studies developed machine-learning (ML) models trained on large datasets of job logs to classify job failures as either intermittent or regular. As an alternative to costly manual labeling of large datasets, the state-of-the-art (SOTA) approach leveraged a heuristic based on non-deterministic job reruns. However, this method mislabels intermittent job failures as regular in contexts where rerunning suspicious job failures is not an explicit policy, and therefore limits the SOTA’s performance in practice. In fact, our manual analysis of 2,125 job failures from 5 industrial and 1 open-source projects reveals that, on average, 32% of intermittent job failures are mislabeled as regular. To address these limitations, this paper introduces a novel approach to intermittent job failure detection using few-shot learning (FSL). Specifically, we fine-tune a small language model using a small number of manually labeled log examples to generate rich embeddings, which are then used to train an ML classifier. Our FSL-based approach achieves 70-88% F1-score with only 12 shots in all projects, outperforming the SOTA, which proved ineffective (34-52% F1-score) in 4 projects. Overall, this study underlines the importance of data quality over quantity and provides a more efficient and practical framework for the detection of intermittent job failures in organizations.
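The two-stage idea in the abstract (embed each job log, then classify the embedding) can be illustrated with a minimal sketch. This is not the authors' implementation. It makes several assumptions: a normalised bag-of-words vector stands in for the embeddings of the fine-tuned small language model, a nearest-centroid rule stands in for the ML classifier, and the class name `FewShotLogClassifier`, the `SHOTS` list, and every log line in it are hypothetical and invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(log):
    """Lowercase a log line and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", log.lower())


class FewShotLogClassifier:
    """Toy two-stage pipeline: embed logs, then classify by nearest centroid.

    Assumption: the bag-of-words embedding below is only a stand-in for the
    embeddings produced by a fine-tuned small language model.
    """

    def fit(self, shots):
        # Build the vocabulary from the few labeled examples ("shots").
        vocab = sorted({tok for log, _ in shots for tok in tokenize(log)})
        self.index = {tok: i for i, tok in enumerate(vocab)}
        # Average the embeddings of each class to obtain one centroid per label.
        sums, counts = {}, Counter()
        for log, label in shots:
            acc = sums.setdefault(label, [0.0] * len(vocab))
            for i, value in enumerate(self.embed(log)):
                acc[i] += value
            counts[label] += 1
        self.centroids = {
            label: [v / counts[label] for v in acc] for label, acc in sums.items()
        }
        return self

    def embed(self, log):
        # Count known tokens, then L2-normalise; unknown tokens are ignored.
        vec = [0.0] * len(self.index)
        for tok, n in Counter(tokenize(log)).items():
            if tok in self.index:
                vec[self.index[tok]] += n
        norm = math.sqrt(sum(v * v for v in vec))
        return [v / norm for v in vec] if norm else vec

    def predict(self, log):
        # Pick the label whose centroid has the largest dot product with the log.
        vec = self.embed(log)
        return max(
            self.centroids,
            key=lambda label: sum(a * b for a, b in zip(vec, self.centroids[label])),
        )


# Hypothetical hand-labeled shots (invented log lines, three per class).
SHOTS = [
    ("connection timed out while pulling docker image", "intermittent"),
    ("runner lost connection to registry, network unreachable", "intermittent"),
    ("no space left on device on shared runner", "intermittent"),
    ("compilation error: undefined symbol in module parser", "regular"),
    ("assertion failed: expected 3 but got 4 in test_parser", "regular"),
    ("syntax error: unexpected token in module config", "regular"),
]

clf = FewShotLogClassifier().fit(SHOTS)
print(clf.predict("docker pull failed: connection timed out"))
print(clf.predict("syntax error in module parser"))
```

In a realistic setting the `embed` step would presumably be replaced by a language model fine-tuned on the labeled shots, and the centroid rule by a trained classifier; the sketch only shows how a handful of carefully labeled logs can drive both stages.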

Link to Preprint

https://arxiv.org/abs/2507.04173

Henri Aïdasso

École de technologie supérieure (ÉTS)

Canada

Francis Bordeleau

École de Technologie Supérieure (ETS)

Canada

Ali Tizghadam

TELUS

Canada

Session Program

Wed 10 Sep
Displayed time zone: Auckland, Wellington

10:30 - 12:00	Session 2 - Quality Assurance 1 (Tool Demonstration Track / Research Papers Track / Industry Track / NIER Track / Journal First Track) at Case Room 2 260-057 Chair(s): Coen De Roover Vrije Universiteit Brussel

10:30 15m		A Jump-Table-Agnostic Switch Recovery on ASTs Research Papers Track Steffen Enders Fraunhofer FKIE, Eva-Maria Behner Fraunhofer FKIE, Elmar Padilla Fraunhofer FKIE
10:45 15m		Quantization Is Not a Dealbreaker: Empirical Insights from Large Code Models Research Papers Track Saima Afrin William & Mary, Antonio Mastropaolo William and Mary, USA, Bowen Xu North Carolina State University Pre-print
11:00 10m		AI-Powered Commit Explorer (APCE) Tool Demonstration Track Yousab Grees Belmont University, Polina Iaremchuk Belmont University, Ramtin Ehsani Drexel University, Esteban Parra Rodriguez Belmont University, Preetha Chatterjee Drexel University, USA, Sonia Haiduc Florida State University Pre-print
11:10 10m		JDala - A Simple Capability System for Java Tool Demonstration Track Quinten Smit Victoria University of Wellington, Jens Dietrich Victoria University of Wellington, Michael Homer Victoria University of Wellington, Andrew Fawcet Victoria University of Wellington, James Noble Independent. Wellington, NZ
11:20 10m		ExpertCache: GPU-Efficient MoE Inference through Reinforcement Learning-Guided Expert Selection NIER Track Xunzhu Tang University of Luxembourg, Tiezhu Sun University of Luxembourg, Yewei Song University of Luxembourg, SiYuanMa, Jacques Klein University of Luxembourg, Tegawendé F. Bissyandé University of Luxembourg
11:30 15m		Efficient Detection of Intermittent Job Failures Using Few-Shot Learning Industry Track Henri Aïdasso École de technologie supérieure (ÉTS), Francis Bordeleau École de Technologie Supérieure (ETS), Ali Tizghadam TELUS Pre-print
11:45 15m		LogOW: A Semi-Supervised Log Anomaly Detection Model in Open-World Setting Journal First Track Jingwei Ye Nankai University, Chunbo Liu Civil Aviation University of China, Zhaojun Gu Civil Aviation University of China, Zhikai Zhang Civil Aviation University of China, Xuying Meng The Institute of Computing Technology, Chinese Academy of Sciences, Weiyao Zhang The Institute of Computing Technology, Chinese Academy of Sciences, Yujun Zhang The Institute of Computing Technology, Chinese Academy of Sciences