Fri 2 May 2025 17:00 - 17:15 at 215 - SE for AI with Quality 3 Chair(s): Sumon Biswas

Testing deep learning (DL) systems requires extensive and diverse, yet valid, test inputs. While synthetic test input generation methods, such as metamorphic testing, are widely used for DL testing, they risk introducing invalid inputs that do not accurately reflect real-world scenarios. Invalid test inputs can lead to misleading results. Hence, there is a need for automated validation of test inputs to ensure effective assessment of DL systems. In this paper, we propose a test input validation approach for vision-based DL systems. Our approach uses active learning to balance the trade-off between accuracy and the manual effort required for test input validation. Further, by employing multiple image-comparison metrics, it achieves better results in classifying valid and invalid test inputs compared to methods that rely on single metrics. We evaluate our approach using an industrial and a public-domain dataset. Our evaluation shows that our multi-metric, active learning-based approach produces several optimal accuracy-effort trade-offs, including those deemed practical and desirable by our industry partner. Furthermore, provided with the same level of manual effort, our approach is significantly more accurate than two state-of-the-art test input validation methods, achieving an average accuracy of 97%. Specifically, the use of multiple metrics, rather than a single metric, results in an average improvement of at least 5.4% in overall accuracy compared to the state-of-the-art baselines. Incorporating an active learning loop for test input validation yields an additional 7.5% improvement in average accuracy, bringing the overall average improvement of our approach to at least 12.9% compared to the baselines.
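The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration of pool-based active learning with uncertainty sampling over multi-metric feature vectors, not the authors' implementation. The abstract does not name the image-comparison metrics or the classifier, so the three pixel-difference scores in `metric_vector`, the nearest-centroid model, the synthetic data in `make_pool`, and all function names here are assumptions chosen to keep the example self-contained.

```python
import math
import random

def metric_vector(original, transformed):
    """Three simple pixel-difference scores (illustrative stand-ins, not the paper's metrics)."""
    diffs = [abs(a - b) for a, b in zip(original, transformed)]
    mse = sum(d * d for d in diffs) / len(diffs)
    mae = sum(diffs) / len(diffs)
    return [mse, mae, max(diffs)]

def fit(features, labels):
    """Nearest-centroid model: one mean vector per class (0 = invalid, 1 = valid)."""
    model = {}
    for c in set(labels):
        members = [f for f, y in zip(features, labels) if y == c]
        model[c] = [sum(col) / len(members) for col in zip(*members)]
    return model

def margin(model, f):
    """Signed margin: positive means f is closer to the 'valid' centroid."""
    return math.dist(f, model[0]) - math.dist(f, model[1])

def predict(model, f):
    return 1 if margin(model, f) > 0 else 0

def active_learning(pool, oracle, seed_indices, budget):
    """Query the most uncertain input `budget` times; the seed must cover both classes."""
    labeled = {i: oracle(i) for i in seed_indices}
    for _ in range(budget):
        model = fit([pool[i] for i in labeled], list(labeled.values()))
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        if not unlabeled:
            break
        # Uncertainty sampling: smallest absolute margin = least confident prediction.
        query = min(unlabeled, key=lambda i: abs(margin(model, pool[i])))
        labeled[query] = oracle(query)  # stands in for one manual validity judgment
    model = fit([pool[i] for i in labeled], list(labeled.values()))
    return model, labeled

def make_pool(n_pairs, n_pixels=64, seed=0):
    """Synthetic (original, transformed) pairs: small noise = valid, large noise = invalid."""
    rng = random.Random(seed)
    pool, truth = [], []
    for k in range(n_pairs):
        original = [rng.random() for _ in range(n_pixels)]
        is_valid = k % 2 == 0
        scale = 0.02 if is_valid else 0.5
        transformed = [p + rng.uniform(-scale, scale) for p in original]
        pool.append(metric_vector(original, transformed))
        truth.append(1 if is_valid else 0)
    return pool, truth

pool, truth = make_pool(40)
model, labeled = active_learning(pool, oracle=lambda i: truth[i],
                                 seed_indices=[0, 1], budget=6)
accuracy = sum(predict(model, f) == y for f, y in zip(pool, truth)) / len(pool)
print(f"labels requested: {len(labeled)}, accuracy on pool: {accuracy:.2f}")
```

Here `budget` plays the role of the manual effort in the accuracy-effort trade-off: each query costs one human label, and raising or lowering it moves along that trade-off.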

Fri 2 May

Displayed time zone: Eastern Time (US & Canada)

16:00 - 17:30
SE for AI with Quality 3 - Research Track / SE In Practice (SEIP) at 215
Chair(s): Sumon Biswas (Case Western Reserve University)
16:00
15m
Talk
Improved Detection and Diagnosis of Faults in Deep Neural Networks Using Hierarchical and Explainable Classification
SE for AI, Artifact-Available
Research Track
Sigma Jahan (Dalhousie University), Mehil Shah (Dalhousie University), Parvez Mahbub (Dalhousie University), Masud Rahman (Dalhousie University)
Pre-print
16:15
15m
Talk
Lightweight Concolic Testing via Path-Condition Synthesis for Deep Learning Libraries
SE for AI, Artifact-Functional, Artifact-Available, Artifact-Reusable
Research Track
16:30
15m
Talk
Mock Deep Testing: Toward Separate Development of Data and Models for Deep Learning
SE for AI
Research Track
Ruchira Manke (Tulane University, USA), Mohammad Wardat (Oakland University, USA), Foutse Khomh (Polytechnique Montréal), Hridesh Rajan (Tulane University)
16:45
15m
Talk
RUG: Turbo LLM for Rust Unit Test Generation
SE for AI
Research Track
Xiang Cheng (Georgia Institute of Technology), Fan Sang (Georgia Institute of Technology), Yizhuo Zhai (Georgia Institute of Technology), Xiaokuan Zhang (George Mason University), Taesoo Kim (Georgia Institute of Technology)
Pre-print, Media Attached, File Attached
17:00
15m
Talk
Test Input Validation for Vision-based DL Systems: An Active Learning Approach
Artifact-Available, Artifact-Functional, Artifact-Reusable, SE for AI
SE In Practice (SEIP)
Delaram Ghobari (University of Ottawa), Mohammad Hossein Amini (University of Ottawa), Dai Quoc Tran (SmartInsideAI Company Ltd. and Sungkyunkwan University), Seunghee Park (SmartInsideAI Company Ltd. and Sungkyunkwan University), Shiva Nejati (University of Ottawa), Mehrdad Sabetzadeh (University of Ottawa)
Pre-print
17:15
15m
Talk
SEMANTIC CODE FINDER: An Efficient Semantic Search Framework for Large-Scale Codebases
SE In Practice (SEIP)
Daeha Ryu (Innovation Center, Samsung Electronics), Seokjun Ko (Samsung Electronics Co.), Eunbi Jang (Innovation Center, Samsung Electronics), Jinyoung Park (Innovation Center, Samsung Electronics), Myunggwan Kim (Innovation Center, Samsung Electronics), Changseo Park (Innovation Center, Samsung Electronics)