ICSE 2026
Sun 12 - Sat 18 April 2026 Rio de Janeiro, Brazil

The responsiveness of a Graphical User Interface (GUI) is paramount to user experience, with loading speed being a critical performance metric directly impacting user satisfaction and retention. Despite the great progress in automating large-scale UI test execution, accurately monitoring GUI loading duration remains time-consuming. The challenge lies in handling diverse app usage environments and interference from UI animations. In this paper, we present PerFrame, a framework that introduces external recording to better align with the user experience and leverages a multimodal large language model (MLLM) to automate this process. Equipped with a robust, scenario-aware pipeline and an iterative sampling approach, PerFrame can significantly reduce the manual effort required to identify loading start and end frames from GUI testing screencasts. We evaluated PerFrame on real-world test screen recordings from Meituan, a leading e-commerce provider in China. PerFrame achieved a discrepancy of less than 5 frames for 61.6% and less than 20 frames for 85.6% of our test cases, representing a 27.7 percentage improvement over previous methods. By automating this crucial step, PerFrame accelerates GUI loading performance monitoring and contributes to improved testing efficiency and enhanced software reliability.
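
The frame-discrepancy figures above can be read as a tolerance-based accuracy: the share of test cases whose predicted frame index lies within a given number of frames of the manually annotated one. The following is a minimal sketch of that reading, not PerFrame's actual evaluation code. The function names, the toy frame indices, and the 60 fps recording rate are all assumptions made for illustration; the abstract does not state the frame rate or whether start and end frames are scored separately.

```python
from typing import Sequence


def within_tolerance_rate(
    predicted: Sequence[int], annotated: Sequence[int], max_frames: int
) -> float:
    """Fraction of cases whose frame discrepancy is strictly below max_frames."""
    if len(predicted) != len(annotated):
        raise ValueError("predicted and annotated must have the same length")
    if not predicted:
        raise ValueError("at least one test case is required")
    hits = sum(1 for p, a in zip(predicted, annotated) if abs(p - a) < max_frames)
    return hits / len(predicted)


def frames_to_ms(frames: int, fps: float) -> float:
    """Convert a frame count to milliseconds at a given recording frame rate."""
    return frames * 1000.0 / fps


# Toy data (hypothetical, not from the paper): frame indices of a loading
# boundary as predicted by a model and as labelled by a human annotator.
predicted_frames = [120, 305, 48, 210, 77]
annotated_frames = [118, 300, 60, 240, 77]

rate_5 = within_tolerance_rate(predicted_frames, annotated_frames, 5)
rate_20 = within_tolerance_rate(predicted_frames, annotated_frames, 20)

# Assumed 60 fps recording: what a 5-frame tolerance means in wall-clock time.
tolerance_ms = frames_to_ms(5, 60.0)
```

On the toy data, two of the five cases differ by fewer than 5 frames and four by fewer than 20, so the two rates are 0.4 and 0.8. Under the assumed 60 fps, a 5-frame tolerance corresponds to roughly 83 ms of loading time.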