An empirical study of web flaky tests: Understanding and unveiling DOM event interaction challenges
Flaky tests, which exhibit non-deterministic behavior and fail without changes to the codebase, pose significant challenges to the reliability and efficiency of software testing processes. Despite extensive research on flaky tests in traditional unit and integration testing, their impact and prevalence within web user interface (UI) testing remain relatively unexplored, especially concerning Document Object Model (DOM) events. In web applications, DOM-related flakiness, which results from unstable interactions between the DOM and events, is particularly prevalent. This study conducts an empirical analysis of 123 flaky tests in 49 open-source web projects, focusing on the correlation between DOM event interactions and test flakiness.
Our findings indicate that DOM events, and their associated interactions with the application, can introduce flakiness in web UI tests; these events are frequently associated with Event-DOM interactions (32.5%), Event operations (22.8%), and Response evaluations (16.3%). The analysis of DOM consistency and event interaction levels reveals that element-level interactions across multiple DOMs are more likely to cause flakiness than interactions confined to a single DOM or occurring at the page level. Furthermore, the primary strategies developers use to handle these issues are synchronizing DOM interactions (50.4%), managing conditional event completion (38.2%), and ensuring consistent DOM state transitions (11.4%). We found that the Event-DOM category has the highest fix frequency (2.6 times), while the DOM category alone takes the longest to resolve (153.4 days). This study provides practical insights into improving web application testing practices by highlighting the importance of understanding and managing DOM event interactions.
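To make the most common fix strategy, synchronizing DOM interactions, more concrete, the following is a minimal sketch of waiting on a condition instead of acting immediately or sleeping for a fixed time. It is an illustration only and is not taken from the studied projects: `wait_until`, `FakeButton`, and `click_when_ready` are hypothetical names, and `FakeButton` merely simulates a DOM element that becomes interactable after a delay rather than driving a real browser.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool],
               timeout: float = 2.0,
               interval: float = 0.05) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


class FakeButton:
    """Hypothetical stand-in for a DOM element that is enabled after a delay."""

    def __init__(self, ready_after: float) -> None:
        self._ready_at = time.monotonic() + ready_after
        self.clicked = False

    def is_enabled(self) -> bool:
        return time.monotonic() >= self._ready_at

    def click(self) -> None:
        # Clicking too early is the kind of race that makes a test flaky.
        if not self.is_enabled():
            raise RuntimeError("element not interactable yet")
        self.clicked = True


def click_when_ready(button: FakeButton, timeout: float = 2.0) -> bool:
    """Click only once the element reports it is enabled; report success."""
    if wait_until(button.is_enabled, timeout=timeout):
        button.click()
        return True
    return False
```

In a real test suite the same idea would normally be expressed through the explicit-wait facilities of the browser automation framework in use; the polling loop above only shows the principle of tying the event to an observed DOM state.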