SOCRATEST- Towards Autonomous Testing Agents via Conversational Large Language Models
Software testing is an important part of the development cycle, yet testing software adequately requires specialized expertise and substantial developer effort. Recently demonstrated capabilities of large language models (LLMs) suggest that they can serve as automated testing assistants, providing helpful information and even driving the testing process. To highlight the potential of this technology, we present a taxonomy of LLM-based testing agents based on their level of autonomy, and describe how a greater level of autonomy can benefit developers in practice. We provide an example use of an LLM as a testing assistant to demonstrate how a conversational framework for testing can help developers. The example also highlights how the often-criticized hallucination of LLMs can be beneficial during testing. We identify other tangible benefits that LLM-driven testing agents can bestow, and also discuss some potential limitations.
Slides: 10.42 Shin Yoo.pdf (18.80 MiB)
Tue 12 Sep (displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna)
10:30 - 12:00 | Testing AI Systems 1 | NIER Track / Research Papers at Room C | Chair(s): Leonardo Mariani (University of Milano-Bicocca)
10:30 | 12m | Talk | Nuances are the Key: Unlocking ChatGPT to Find Failure-Inducing Tests with Differential Prompting | Research Papers | Li Tsz On (The Hong Kong University of Science and Technology); Wenxi Zong (Northeastern University); Yibo Wang (Northeastern University); Haoye Tian (University of Luxembourg); Ying Wang (Northeastern University); Shing-Chi Cheung (Hong Kong University of Science and Technology); Jeffrey Kramer (Imperial College London)
10:42 | 12m | Talk | SOCRATEST- Towards Autonomous Testing Agents via Conversational Large Language Models | NIER Track | Robert Feldt (Chalmers University of Technology, Sweden); Sungmin Kang (KAIST); Juyeon Yoon (Korea Advanced Institute of Science and Technology); Shin Yoo (KAIST)
10:54 | 12m | Research paper | Semantic Data Augmentation for Deep Learning Testing using Generative AI | NIER Track | Sondess Missaoui (University of York); Simos Gerasimou (University of York); Nicholas Matragkas (Université Paris-Saclay, CEA, List)
11:06 | 12m | Talk | Robin: A Novel Method to Produce Robust Interpreters for Deep Learning-Based Code Classifiers | Research Papers | Zhen Li (Huazhong University of Science and Technology); Ruqian Zhang (Huazhong University of Science and Technology); Deqing Zou (Huazhong University of Science and Technology); Ning Wang (Huazhong University of Science and Technology); Yating Li (Huazhong University of Science and Technology); Shouhuai Xu (University of Colorado Colorado Springs); Chen Chen (University of Central Florida); Hai Jin (Huazhong University of Science and Technology)
11:18 | 12m | Talk | The Devil is in the Tails: How Long-Tailed Code Distributions Impact Large Language Models | Research Papers | Xin Zhou (Singapore Management University, Singapore); Kisub Kim (Singapore Management University, Singapore); Bowen Xu (North Carolina State University); Jiakun Liu (Singapore Management University); DongGyun Han (Royal Holloway, University of London); David Lo (Singapore Management University)
11:30 | 12m | Talk (recorded) | CertPri: Certifiable Prioritization for Deep Neural Networks via Movement Cost in Feature Space | Research Papers | Haibin Zheng (Zhejiang University of Technology); Jinyin Chen (College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China); Haibo Jin (Zhejiang University of Technology)