Who's to Blame? Rethinking the Brittleness of Automated Web GUI Testing from a Pragmatic Perspective
This program is tentative and subject to change.
Automated web GUI testing is important for software quality; however, its effectiveness is often undermined by test case brittleness, especially in continuously evolving real-world applications. In this experience paper, we pragmatically investigate the root causes of brittleness. We first analyze why legacy test cases, derived from the Mind2Web dataset, fail when executed on current versions of the web applications. Our findings reveal that brittleness stems from multiple factors, including test script design, web application complexity, and automation framework limitations. A longitudinal study further shows that 81.7% of repaired tests break again within six months, primarily due to similar recurring issues, highlighting the persistent nature of brittleness. We further demonstrate that Large Language Models, when provided with human-like diagnostic context, can successfully repair a substantial portion of these brittle tests, though human expertise remains important for more complex scenarios. Our findings emphasize that brittleness is a multifaceted problem requiring collaboration among the different parties involved in test automation.
Wed 19 Nov (displayed time zone: Seoul)
14:00 - 15:30
14:00 | 10m | Talk | Adaptive and accessible user interfaces for seniors through model-driven engineering | Journal-First Track | Shavindra Wickramathilaka (Monash University), John Grundy (Monash University), Kashumi Madampe (Monash University, Australia), Omar Haggag (Monash University, Australia)
14:10 | 10m | Talk | AppBDS: LLM-Powered Description Synthesis for Sensitive Behaviors in Mobile Apps | Research Papers
14:20 | 10m | Talk | Large Language Models for Automated Web-Form-Test Generation: An Empirical Study | Journal-First Track | Tao Li (Macau University of Science and Technology), Chenhui Cui (Macau University of Science and Technology), Rubing Huang (Macau University of Science and Technology (M.U.S.T.)), Dave Towey (University of Nottingham Ningbo China), Lei Ma (The University of Tokyo & University of Alberta)
14:30 | 10m | Talk | Beyond Static GUI Agent: Evolving LLM-based GUI Testing via Dynamic Memory | Research Papers | Mengzhuo Chen (Institute of Software, Chinese Academy of Sciences), Zhe Liu (Institute of Software, Chinese Academy of Sciences), Chunyang Chen (TU Munich), Junjie Wang (Institute of Software at Chinese Academy of Sciences), Yangguang Xue (University of Chinese Academy of Sciences), Boyu Wu (Institute of Software at Chinese Academy of Sciences), Yuekai Huang (Institute of Software, Chinese Academy of Sciences), Libin Wu (Institute of Software Chinese Academy of Sciences), Qing Wang (Institute of Software at Chinese Academy of Sciences)
14:40 | 10m | Talk | Who's to Blame? Rethinking the Brittleness of Automated Web GUI Testing from a Pragmatic Perspective | Research Papers | Haonan Zhang (University of Waterloo), Kundi Yao (University of Waterloo), Zishuo Ding (The Hong Kong University of Science and Technology (Guangzhou)), Lizhi Liao (Memorial University of Newfoundland), Weiyi Shang (University of Waterloo)
14:50 | 10m | Talk | LLM-Cure: LLM-based Competitor User Review Analysis for Feature Enhancement | Journal-First Track | Maram Assi (Université du Québec à Montréal), Safwat Hassan (University of Toronto), Ying Zou (Queen's University, Kingston, Ontario)
15:00 | 10m | Talk | MIMIC: Integrating Diverse Personality Traits for Better Game Testing Using Large Language Model | Research Papers | Pre-print
15:10 | 10m | Talk | Debun: Detecting Bundled JavaScript Libraries on Web using Property-Order Graphs | Research Papers | Seojin Kim (North Carolina State University), Sungmin Park (Korea University), Jihyeok Park (Korea University)
15:20 | 10m | Talk | GUIFuzz++: Unleashing Grey-box Fuzzing on Desktop Graphical User Interfacing Applications | Research Papers | Pre-print