Near-Duplicate Build Failure Detection from Continuous Integration Logs
Addressing build failures is important in software development, particularly within Continuous Integration and Deployment (CI/CD) pipelines. A common issue arises when a developer encounters a new build failure that has in fact occurred before. We refer to such build failures as near-duplicate build failures. We collect build logs from a GitHub Actions CI/CD pipeline and label near-duplicate build failures by identifying those that have the same failed tests. We propose a framework for detecting near-duplicate build failures through log similarity analysis. Because the majority of log lines in failed builds are similar to those in passing logs, we propose Out-of-Vocabulary Detector (OOVD) filtering to retain only failure-relevant lines and to improve near-duplicate detection accuracy. Our results suggest a noticeable improvement from OOVD filtering across all metrics for Top-K results (K = 5; precision@K: 0.864 vs. 0.526 and MAP@K: 0.941 vs. 0.814). Finding near-duplicate build failures is an important software engineering challenge that, to the best of our knowledge, has not been studied before. Upon acceptance of the paper we will make the scripts and the data available.
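The filtering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how OOVD works or which similarity measure is used, so this sketch assumes a token-level vocabulary built from passing logs, keeps a line of a failed log only if it contains at least one token absent from that vocabulary, and ranks past failures by Jaccard similarity. All function names, build IDs, and log contents are hypothetical.

```python
import re

# Assumption: a token is an identifier-like word; numbers and punctuation are ignored.
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def tokenize(line):
    """Split a log line into lowercase word tokens."""
    return [t.lower() for t in TOKEN_RE.findall(line)]


def build_vocabulary(passing_logs):
    """Collect every token that appears in any passing build log."""
    vocab = set()
    for log in passing_logs:
        for line in log.splitlines():
            vocab.update(tokenize(line))
    return vocab


def oov_filter(log, vocab):
    """Keep only lines containing at least one token unseen in passing logs."""
    return [
        line
        for line in log.splitlines()
        if any(tok not in vocab for tok in tokenize(line))
    ]


def failure_tokens(log, vocab):
    """Token set of the lines that survive the out-of-vocabulary filter."""
    tokens = set()
    for line in oov_filter(log, vocab):
        tokens.update(tokenize(line))
    return tokens


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_similar(query_log, past_failures, vocab, k=5):
    """Return the top-k past failures most similar to the query failure."""
    query = failure_tokens(query_log, vocab)
    scored = [
        (build_id, jaccard(query, failure_tokens(log, vocab)))
        for build_id, log in past_failures.items()
    ]
    # Highest similarity first; ties broken by build ID for determinism.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:k]


# Hypothetical toy data, invented for illustration only.
passing = ["Step 1 checkout ok\nStep 2 compile ok\nStep 3 run tests ok"]
past = {
    "build-101": "Step 1 checkout ok\nStep 2 compile ok\nFAILED test_login AssertionError",
    "build-102": "Step 1 checkout ok\nStep 2 compile error missing symbol foo",
}
query = (
    "Step 1 checkout ok\nStep 2 compile ok\nStep 3 run tests\n"
    "FAILED test_login AssertionError timeout"
)

vocab = build_vocabulary(passing)
ranked = rank_similar(query, past, vocab, k=5)
print(ranked)
```

In this toy example only the final line of the query survives the filter, so the comparison is driven by the failure message rather than by the boilerplate steps shared with passing builds.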
Thu 26 Jun. Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna
14:00 - 15:30
14:00 | 60m | Keynote | Keynote 2 (Dr. Haipeng Cai) | PROMISE 2025 | Haipeng Cai (University at Buffalo, SUNY)
15:01 | 14m | Talk | A Qualitative Investigation into LLM-Generated Multilingual Code Comments and Automatic Evaluation Metrics | PROMISE 2025 | Jonathan Katzy (Delft University of Technology), Yongcheng Huang (Delft University of Technology), Gopal-Raj Panchu (Delft University of Technology), Maksym Ziemlewski (Delft University of Technology), Paris Loizides (Delft University of Technology), Sander Vermeulen (Delft University of Technology), Arie van Deursen (TU Delft), Maliheh Izadi (Delft University of Technology)
15:16 | 9m | Talk | Near-Duplicate Build Failure Detection from Continuous Integration Logs | PROMISE 2025 | Mingchen Li (University of Helsinki), Mika Mäntylä (University of Helsinki and University of Oulu), Jesse Nyyssölä (University of Helsinki), Matti Luukkainen (University of Helsinki)