ASE 2025
Sun 16 - Thu 20 November 2025, Seoul, South Korea

The development of Large Language Models (LLMs) has enabled LLM-based GUI testing, in which an agent interacts with graphical user interfaces by understanding GUI screenshots and generating actions; such approaches are widely applied in industry and academia. However, current approaches test each app in isolation, lacking mechanisms for accumulating and reusing experience. This limitation often causes GUI testing approaches to miss deeper exploration and fail to trigger bug-prone functionalities. To address this, we propose MemoDroid, a three-layer memory mechanism that augments LLM-based GUI testing with the ability to evolve through repeated interaction. MemoDroid employs episodic memory to capture functional-level testing traces, reflective memory to summarize failure patterns and redundant behaviors, and strategic memory to synthesize cross-app exploration strategies. These memory layers are dynamically retrieved and injected into LLM prompts at runtime, enabling the agent to reuse successful behaviors, avoid ineffective actions, and prioritize bug-prone paths. We implement MemoDroid as a lightweight plugin that can be integrated into existing LLM-based GUI testing approaches. We evaluate MemoDroid on real-world apps from 15 diverse categories. Results show that MemoDroid improves GUI testing performance across five baseline methods, increasing activity coverage by 79%-96%, code coverage by 81%-97%, and bug detection by 57%-198%. Ablation studies confirm the contribution of each memory layer. Furthermore, MemoDroid detects 49 new bugs in 200 real-world apps, of which 35 have been confirmed and fixed and 14 have been acknowledged by developers, demonstrating the practical value of memory-driven GUI testing.
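
To make the retrieve-and-inject loop concrete, the sketch below shows one way the three memory layers could feed an action-generation prompt. It is a minimal illustration in Python under stated assumptions: the names (MemoryEntry, ThreeLayerMemory, build_prompt) and the naive keyword-overlap retriever are hypothetical, and the paper's actual data structures, retrieval method, and prompt format may differ.

# Hypothetical sketch of a three-layer memory mechanism for an LLM-based
# GUI testing agent, in the spirit of the abstract above. All names and
# the retrieval heuristic are illustrative assumptions, not MemoDroid's API.
from dataclasses import dataclass, field


@dataclass
class MemoryEntry:
    key: str      # e.g. a functionality or screen descriptor
    content: str  # natural-language summary injected into the prompt


@dataclass
class ThreeLayerMemory:
    # Episodic: functional-level testing traces from earlier runs.
    episodic: list[MemoryEntry] = field(default_factory=list)
    # Reflective: summarized failure patterns and redundant behaviors.
    reflective: list[MemoryEntry] = field(default_factory=list)
    # Strategic: cross-app exploration strategies.
    strategic: list[MemoryEntry] = field(default_factory=list)

    def retrieve(self, screen_desc: str, k: int = 3) -> dict[str, list[str]]:
        """Naive keyword-overlap retrieval; a real system would more
        likely rank entries by embedding similarity to the screen."""
        words = set(screen_desc.lower().split())

        def top_k(entries: list[MemoryEntry]) -> list[str]:
            scored = sorted(
                entries,
                key=lambda e: len(words & set(e.key.lower().split())),
                reverse=True,
            )
            return [e.content for e in scored[:k]]

        return {
            "episodic": top_k(self.episodic),
            "reflective": top_k(self.reflective),
            "strategic": top_k(self.strategic),
        }


def build_prompt(screen_desc: str, memory: ThreeLayerMemory) -> str:
    """Inject the retrieved memories into the action-generation prompt."""
    retrieved = memory.retrieve(screen_desc)
    sections = [
        "You are a GUI testing agent. Current screen:\n" + screen_desc,
        "Relevant past traces:\n" + "\n".join(retrieved["episodic"]),
        "Known failure patterns to avoid:\n" + "\n".join(retrieved["reflective"]),
        "Exploration strategies to prioritize:\n" + "\n".join(retrieved["strategic"]),
        "Propose the next UI action.",
    ]
    return "\n\n".join(sections)


if __name__ == "__main__":
    mem = ThreeLayerMemory(
        episodic=[MemoryEntry(
            "login screen",
            "Entering valid credentials on the login screen reaches the home activity.")],
        reflective=[MemoryEntry(
            "login screen",
            "Tapping 'Forgot password' repeatedly is redundant and yields no new coverage.")],
        strategic=[MemoryEntry(
            "form input",
            "Prefer submitting forms with boundary values; they often trigger crashes.")],
    )
    print(build_prompt("login screen with username and password fields", mem))

Under these assumptions, the three layers map directly onto the abstract's goals: episodic entries let the agent reuse successful behaviors, reflective entries steer it away from ineffective actions, and strategic entries prioritize bug-prone paths across apps.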