Enabling Cost-Effective UI Automation Testing with Retrieval-Based LLMs: A Case Study in WeChat
UI automation tests play a crucial role in ensuring the quality of mobile applications. Despite the growing popularity of machine learning techniques for generating these tests, such techniques still face several challenges, such as mismatched UI elements. Recent advances in Large Language Models (LLMs) have addressed these issues by leveraging the models' semantic understanding capabilities. However, a significant gap remains in applying these models to industrial-level app testing, particularly in terms of cost optimization and knowledge limitations. To address this gap, we introduce CAT, which creates cost-effective UI automation tests for industry apps by combining machine learning and LLMs with best practices. Given a task description, CAT employs Retrieval Augmented Generation (RAG) to source examples of industrial app usage as the few-shot learning context, assisting LLMs in generating the specific sequence of actions. CAT then employs machine learning techniques, with LLMs serving as a complementary optimizer, to map each action to its target element on the UI screen. Our evaluations on the WeChat testing dataset demonstrate CAT’s performance and cost-effectiveness, achieving 90% UI automation at a cost of $0.34, outperforming the state-of-the-art. We have also integrated our approach into the real-world WeChat testing platform, demonstrating its usefulness in detecting 141 bugs and enhancing the developers’ testing process.
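The two stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: every function name, the bag-of-words cosine similarity (a stand-in for whatever retriever and matcher CAT actually uses), the 0.5 threshold, and the `llm_pick` callable are assumptions made for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a deliberately simple stand-in for a real embedding."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Bag-of-words cosine similarity between two strings."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def retrieve_examples(task, corpus, k=2):
    """Stage 1 (retrieval): pick the k stored usage examples closest to the task."""
    ranked = sorted(corpus, key=lambda ex: cosine(task, ex["task"]), reverse=True)
    return ranked[:k]


def build_prompt(task, examples):
    """Format the retrieved examples as few-shot context for an LLM."""
    shots = "\n\n".join(
        f"Task: {ex['task']}\nActions: {' -> '.join(ex['actions'])}"
        for ex in examples
    )
    return f"{shots}\n\nTask: {task}\nActions:"


def map_element(target, screen_elements, llm_pick=None, threshold=0.5):
    """Stage 2 (mapping): try the cheap matcher first; call the LLM only if unsure.

    `llm_pick` is a hypothetical callable (target, elements) -> element.
    """
    if not screen_elements:
        return None
    best_score, best = max(
        ((cosine(target, el), el) for el in screen_elements),
        key=lambda pair: pair[0],
    )
    if best_score >= threshold or llm_pick is None:
        return best
    return llm_pick(target, screen_elements)


# Hypothetical usage data, invented for this sketch.
corpus = [
    {"task": "send a message to a friend",
     "actions": ["tap Chats", "tap friend", "type text", "tap Send"]},
    {"task": "post a photo to Moments",
     "actions": ["tap Discover", "tap Moments", "tap camera", "tap Post"]},
    {"task": "transfer money to a friend",
     "actions": ["tap Chats", "tap friend", "tap Transfer", "tap Confirm"]},
]
task = "send a text message to Alice"
print(build_prompt(task, retrieve_examples(task, corpus, k=1)))
print(map_element("Send button", ["Send", "Cancel", "More options"]))
```

The design point the sketch tries to convey is the cost control: the inexpensive matcher handles the confident cases, and the LLM is consulted only when the match score falls below the threshold.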
Tue 29 Oct | Displayed time zone: Pacific Time (US & Canada)
13:30 - 15:00 | Web and UI | Research Papers / Industry Showcase / Tool Demonstrations | at Carr | Chair(s): Mattia Fazzini University of Minnesota
13:30 | 15m | Talk | Beyond Manual Modeling: Automating GUI Model Generation Using Design Documents | Research Papers | Shaoheng Cao Nanjing University, Renyi Chen Samsung Electronics (China) R&D Centre, Minxue Pan Nanjing University, Wenhua Yang Nanjing University of Aeronautics and Astronautics, Xuandong Li Nanjing University
13:45 | 15m | Talk | Towards a Robust Waiting Strategy for Web GUI Testing for an Industrial Software System | Industry Showcase | Haonan Zhang University of Waterloo, Lizhi Liao Memorial University of Newfoundland, Zishuo Ding The Hong Kong University of Science and Technology (Guangzhou), Weiyi Shang University of Waterloo, Nidhi Narula ERA Environmental, Catalin Sporea ERA Environmental Management Solutions, Andrei Toma ERA Environmental Management Solutions, Sarah Sajedi ERA Environmental Management Solutions
14:00 | 15m | Talk | Navigating Mobile Testing Evaluation: A Comprehensive Statistical Analysis of Android GUI Testing Metrics | Research Papers | Yuanhong Lan Nanjing University, Yifei Lu Nanjing University, Minxue Pan Nanjing University, Xuandong Li Nanjing University
14:15 | 15m | Talk | Can Cooperative Multi-Agent Reinforcement Learning Boost Automatic Web Testing? An Exploratory Study | Research Papers | Yujia Fan Southern University of Science and Technology, Sinan Wang Southern University of Science and Technology, Zebang Fei Southern University of Science and Technology, Yao Qin Southern University of Science and Technology, Huaxuan Li Southern University of Science and Technology, Yepang Liu Southern University of Science and Technology
14:30 | 10m | Talk | Enabling Cost-Effective UI Automation Testing with Retrieval-Based LLMs: A Case Study in WeChat | Industry Showcase | Sidong Feng Monash University, Haochuan Lu Tencent, Jianqin Jiang Tencent Inc., Ting Xiong Tencent Inc., Likun Huang Tencent Inc., Yinglin Liang Tencent Inc., Xiaoqin Li Tencent Inc., Yuetang Deng Tencent, Aldeida Aleti Monash University
14:40 | 10m | Talk | Self-Elicitation of Requirements with Automated GUI Prototyping | Tool Demonstrations | Kristian Kolthoff Institute for Enterprise Systems (InES), University Of Mannheim, Christian Bartelt, Simone Paolo Ponzetto Data and Web Science Group, University of Mannheim, Kurt Schneider Leibniz Universität Hannover, Software Engineering Group