Enabling Cost-Effective UI Automation Testing with Retrieval-Based LLMs: A Case Study in WeChat (ASE 2024 - Industry Showcase)

Who

Sidong Feng, Haochuan Lu, Jianqin Jiang, Ting Xiong, Likun Huang, Yinglin Liang, Xiaoqin Li, Yuetang Deng, Aldeida Aleti

Track

ASE 2024 Industry Showcase

Time Zone

The program is currently displayed in (GMT-07:00) Pacific Time (US & Canada).

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Tue 29 Oct 2024, 14:30 - 14:40, at Carr, in the session "Web and UI". Chair(s): Mattia Fazzini

Abstract

UI automation tests play a crucial role in ensuring the quality of mobile applications. Despite the growing popularity of machine learning techniques for generating these tests, such techniques still face several challenges, such as mismatched UI elements. Recent advances in Large Language Models (LLMs) have addressed these issues by leveraging the models' semantic understanding capabilities. However, a significant gap remains in applying these models to industrial-level app testing, particularly in terms of cost optimization and knowledge limitations. To address this, we introduce CAT, which creates cost-effective UI automation tests for industry apps by combining machine learning and LLMs with best practices. Given a task description, CAT employs Retrieval Augmented Generation (RAG) to source examples of industrial app usage as the few-shot learning context, assisting LLMs in generating the specific sequence of actions. CAT then employs machine learning techniques, with LLMs serving as a complementary optimizer, to map each action to its target element on the UI screen. Our evaluations on the WeChat testing dataset demonstrate CAT’s performance and cost-effectiveness, achieving 90% UI automation at a cost of $0.34 and outperforming the state of the art. We have also integrated our approach into the real-world WeChat testing platform, demonstrating its usefulness in detecting 141 bugs and enhancing the developers’ testing process.
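The retrieval step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the example store, the task wording, the helper names (`retrieve`, `build_prompt`), and the bag-of-words cosine similarity are all assumptions chosen to keep the example self-contained. A production system would more plausibly use learned embeddings and a real usage corpus. The sketch only shows the general idea of picking the most similar past usage examples and placing them in the prompt as few-shot context.

```python
import math
import re
from collections import Counter

# Hypothetical usage examples (invented for illustration, not from the paper).
EXAMPLES = [
    {"task": "send a text message to a contact",
     "actions": ["tap Chats", "tap contact name", "type message", "tap Send"]},
    {"task": "post a photo to moments",
     "actions": ["tap Discover", "tap Moments", "tap camera icon",
                 "choose photo", "tap Post"]},
    {"task": "add a new contact by phone number",
     "actions": ["tap Contacts", "tap Add", "type phone number",
                 "tap Search", "tap Add to contacts"]},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(task, examples, k=2):
    """Return the k stored examples whose task text is most similar to `task`."""
    query = Counter(tokenize(task))
    ranked = sorted(
        examples,
        key=lambda ex: cosine(query, Counter(tokenize(ex["task"]))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(task, shots):
    """Format retrieved examples as few-shot context, ending with the new task."""
    parts = [
        f"Task: {ex['task']}\nActions: " + "; ".join(ex["actions"])
        for ex in shots
    ]
    parts.append(f"Task: {task}\nActions:")
    return "\n\n".join(parts)


task = "send a photo message to a contact"
shots = retrieve(task, EXAMPLES, k=2)
prompt = build_prompt(task, shots)
print(prompt)
```

The resulting prompt would then be sent to an LLM, which is expected to complete the final `Actions:` line. Mapping each generated action to a concrete element on the screen is a separate step that this sketch does not cover.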

Sidong Feng, Monash University, Australia

Haochuan Lu, Tencent, China

Jianqin Jiang, Tencent Inc.

Ting Xiong, Tencent Inc.

Likun Huang, Tencent Inc.

Yinglin Liang, Tencent Inc.

Xiaoqin Li, Tencent Inc.

Yuetang Deng, Tencent, China

Aldeida Aleti, Monash University, Australia

Session Program

Tue 29 Oct
Displayed time zone: Pacific Time (US & Canada)

13:30 - 15:00	Web and UI (Research Papers / Industry Showcase / Tool Demonstrations) at Carr. Chair(s): Mattia Fazzini (University of Minnesota)

13:30 (15m Talk)	Beyond Manual Modeling: Automating GUI Model Generation Using Design Documents
Track: Research Papers
Authors: Shaoheng Cao (Nanjing University), Renyi Chen (Samsung Electronics (China) R&D Centre), Minxue Pan (Nanjing University), Wenhua Yang (Nanjing University of Aeronautics and Astronautics), Xuandong Li (Nanjing University)

13:45 (15m Talk)	Towards a Robust Waiting Strategy for Web GUI Testing for an Industrial Software System
Track: Industry Showcase
Authors: Haonan Zhang (University of Waterloo), Lizhi Liao (Memorial University of Newfoundland), Zishuo Ding (The Hong Kong University of Science and Technology (Guangzhou)), Weiyi Shang (University of Waterloo), Nidhi Narula (ERA Environmental), Catalin Sporea (ERA Environmental Management Solutions), Andrei Toma (ERA Environmental Management Solutions), Sarah Sajedi (ERA Environmental Management Solutions)

14:00 (15m Talk)	Navigating Mobile Testing Evaluation: A Comprehensive Statistical Analysis of Android GUI Testing Metrics
Track: Research Papers
Authors: Yuanhong Lan (Nanjing University), Yifei Lu (Nanjing University), Minxue Pan (Nanjing University), Xuandong Li (Nanjing University)

14:15 (15m Talk)	Can Cooperative Multi-Agent Reinforcement Learning Boost Automatic Web Testing? An Exploratory Study
Track: Research Papers
Authors: Yujia Fan (Southern University of Science and Technology), Sinan Wang (Southern University of Science and Technology), Zebang Fei (Southern University of Science and Technology), Yao Qin (Southern University of Science and Technology), Huaxuan Li (Southern University of Science and Technology), Yepang Liu (Southern University of Science and Technology)

14:30 (10m Talk)	Enabling Cost-Effective UI Automation Testing with Retrieval-Based LLMs: A Case Study in WeChat
Track: Industry Showcase
Authors: Sidong Feng (Monash University), Haochuan Lu (Tencent), Jianqin Jiang (Tencent Inc.), Ting Xiong (Tencent Inc.), Likun Huang (Tencent Inc.), Yinglin Liang (Tencent Inc.), Xiaoqin Li (Tencent Inc.), Yuetang Deng (Tencent), Aldeida Aleti (Monash University)

14:40 (10m Talk)	Self-Elicitation of Requirements with Automated GUI Prototyping
Track: Tool Demonstrations
Authors: Kristian Kolthoff (Institute for Enterprise Systems (InES), University of Mannheim), Christian Bartelt, Simone Paolo Ponzetto (Data and Web Science Group, University of Mannheim), Kurt Schneider (Leibniz Universität Hannover, Software Engineering Group)