Semantic Data Augmentation for Deep Learning Testing using Generative AI
The performance of state-of-the-art Deep Learning models depends heavily on the availability of well-curated training and testing datasets that sufficiently capture the operational domain. Data augmentation is an effective technique for alleviating data scarcity, reducing the need for time-consuming and expensive data collection and labelling. Despite their potential, existing data augmentation techniques focus primarily on simple geometric and colour space transformations, such as noise, flipping and resizing, and so produce datasets with limited diversity. When such an augmented dataset is used to test Deep Learning models, the results are typically uninformative about the robustness of the models. We address this gap by introducing GENFUZZER, a novel coverage-guided data augmentation fuzzing technique for Deep Learning models underpinned by generative AI. We demonstrate our approach using widely adopted datasets and models for image classification, illustrating its effectiveness in generating informative datasets that yield up to a 26% increase in widely used coverage criteria.
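To make the idea of coverage-guided augmentation concrete, the following is a minimal sketch and not GENFUZZER's actual implementation. It assumes neuron coverage as the coverage criterion (the abstract does not name the criteria used), a toy ReLU network given as a list of weight matrices, and a hypothetical `placeholder_mutate` function that adds Gaussian noise where the real technique would call a generative model. All function and parameter names here are illustrative.

```python
import numpy as np


def neuron_activations(x, weights):
    """Run x through a toy ReLU MLP and return all hidden activations."""
    acts = []
    h = x
    for W in weights:
        h = np.maximum(h @ W, 0.0)  # ReLU layer without bias
        acts.append(h)
    return np.concatenate(acts)


def placeholder_mutate(x, rng):
    """Hypothetical stand-in for a generative-model augmentation step."""
    return x + rng.normal(scale=0.5, size=x.shape)


def coverage_guided_select(seeds, mutate, weights, threshold=0.5,
                           rounds=20, rng=None):
    """Keep only mutated inputs that activate previously uncovered neurons.

    Returns (kept_inputs, coverage_before, coverage_after), where coverage
    is the fraction of neurons whose activation exceeded `threshold`.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    n_neurons = sum(W.shape[1] for W in weights)
    covered = np.zeros(n_neurons, dtype=bool)
    for s in seeds:
        covered |= neuron_activations(s, weights) > threshold
    coverage_before = covered.mean()

    kept = []
    queue = list(seeds)
    for _ in range(rounds):
        parent = queue[rng.integers(len(queue))]
        child = mutate(parent, rng)
        hit = neuron_activations(child, weights) > threshold
        if np.any(hit & ~covered):  # child reaches at least one new neuron
            covered |= hit
            kept.append(child)
            queue.append(child)     # new inputs become future parents
    return kept, coverage_before, covered.mean()


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    weights = [rng.normal(size=(8, 16)), rng.normal(size=(16, 4))]
    seeds = [rng.normal(size=8) for _ in range(3)]
    kept, before, after = coverage_guided_select(
        seeds, placeholder_mutate, weights, rounds=30, rng=rng)
    print(f"kept {len(kept)} inputs; coverage {before:.2f} -> {after:.2f}")
```

The design choice illustrated is the feedback loop: a candidate input is retained only when it raises coverage, so the resulting test set grows with inputs that exercise new model behaviour rather than with arbitrary variations.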
Tue 12 Sep. Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna.
Session: Testing AI Systems 1 (NIER Track / Research Papers), 10:30 - 12:00, Room C. Chair: Leonardo Mariani (University of Milano-Bicocca).

| Time | Type | Title | Track | Authors |
| --- | --- | --- | --- | --- |
| 10:30 (12 min) | Talk | Nuances are the Key: Unlocking ChatGPT to Find Failure-Inducing Tests with Differential Prompting | Research Papers | Li Tsz On (The Hong Kong University of Science and Technology); Wenxi Zong (Northeastern University); Yibo Wang (Northeastern University); Haoye Tian (University of Luxembourg); Ying Wang (Northeastern University); Shing-Chi Cheung (Hong Kong University of Science and Technology); Jeffrey Kramer (Imperial College London) |
| 10:42 (12 min) | Talk | SOCRATEST - Towards Autonomous Testing Agents via Conversational Large Language Models | NIER Track | Robert Feldt (Chalmers University of Technology, Sweden); Sungmin Kang (KAIST); Juyeon Yoon (Korea Advanced Institute of Science and Technology); Shin Yoo (KAIST) |
| 10:54 (12 min) | Research paper | Semantic Data Augmentation for Deep Learning Testing using Generative AI | NIER Track | Sondess Missaoui (University of York); Simos Gerasimou (University of York); Nicholas Matragkas (Université Paris-Saclay, CEA, List) |
| 11:06 (12 min) | Talk | Robin: A Novel Method to Produce Robust Interpreters for Deep Learning-Based Code Classifiers | Research Papers | Zhen Li (Huazhong University of Science and Technology); Ruqian Zhang (Huazhong University of Science and Technology); Deqing Zou (Huazhong University of Science and Technology); Ning Wang (Huazhong University of Science and Technology); Yating Li (Huazhong University of Science and Technology); Shouhuai Xu (University of Colorado Colorado Springs); Chen Chen (University of Central Florida); Hai Jin (Huazhong University of Science and Technology) |
| 11:18 (12 min) | Talk | The Devil is in the Tails: How Long-Tailed Code Distributions Impact Large Language Models | Research Papers | Xin Zhou (Singapore Management University, Singapore); Kisub Kim (Singapore Management University, Singapore); Bowen Xu (North Carolina State University); Jiakun Liu (Singapore Management University); DongGyun Han (Royal Holloway, University of London); David Lo (Singapore Management University) |
| 11:30 (12 min) | Talk (recorded) | CertPri: Certifiable Prioritization for Deep Neural Networks via Movement Cost in Feature Space | Research Papers | Haibin Zheng (Zhejiang University of Technology); Jinyin Chen (College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China); Haibo Jin (Zhejiang University of Technology) |

