ICTSS 2025
Wed 17 - Fri 19 September 2025 Limassol, Cyprus
co-located with ECSA 2025

Learning-based static code analysis is an essential part of security testing. Effective learning algorithms are crucial for training reliable models that can accurately detect weaknesses and vulnerabilities. When training such models, however, it is equally important to use adequate datasets of vulnerable and non-vulnerable source code.

Most existing learning-based methods have been evaluated on public datasets of code fragments labeled as vulnerable or non-vulnerable. However, such datasets are known to contain spurious entries and are often imbalanced, i.e., they contain a large proportion of non-vulnerable code. While the first issue is often addressed by data cleaning during pre-processing, the second is largely ignored.

This paper reports a preliminary study of how imbalanced datasets and imbalance-handling techniques affect the performance of learning-based vulnerability detection methods. Our results show that (i) resampling approaches, in particular a combination of over- and undersampling, can generate reliable models and corroborate the results; and (ii) imbalance-aware loss functions can improve performance on highly imbalanced and varied datasets.
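The abstract does not say which resampling scheme or loss function the study uses. As a rough illustration only, the sketch below assumes plain random over- and undersampling toward a common per-class size, and a binary focal loss as one example of an imbalance-aware loss. The names `resample_combined`, `focal_loss`, `target_per_class`, `gamma`, and `alpha` are hypothetical and chosen for this example, not taken from the paper.

```python
import math
import random
from collections import Counter


def resample_combined(samples, labels, target_per_class, seed=0):
    """Bring every class to target_per_class items (assumed scheme, not the paper's)."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)

    out_samples, out_labels = [], []
    for label, items in by_class.items():
        if len(items) >= target_per_class:
            # Undersample the larger class: draw without replacement.
            chosen = rng.sample(items, target_per_class)
        else:
            # Oversample the smaller class: keep all originals, add random duplicates.
            extra = [rng.choice(items) for _ in range(target_per_class - len(items))]
            chosen = items + extra
        out_samples.extend(chosen)
        out_labels.extend([label] * target_per_class)
    return out_samples, out_labels


def focal_loss(p_vulnerable, label, gamma=2.0, alpha=0.75):
    """Binary focal loss for one prediction; label 1 means vulnerable."""
    p_true = p_vulnerable if label == 1 else 1.0 - p_vulnerable
    weight = alpha if label == 1 else 1.0 - alpha
    p_true = min(max(p_true, 1e-12), 1.0)  # avoid log(0)
    # Confident correct predictions are down-weighted by (1 - p_true) ** gamma.
    return -weight * (1.0 - p_true) ** gamma * math.log(p_true)


# Toy data: 90 non-vulnerable (0) and 10 vulnerable (1) code fragments.
samples = [f"fragment_{i}" for i in range(100)]
labels = [0] * 90 + [1] * 10
balanced_samples, balanced_labels = resample_combined(samples, labels, target_per_class=30)
class_counts = Counter(balanced_labels)
```

In this toy setup the majority class is cut from 90 to 30 items and the minority class is grown from 10 to 30. With `gamma=0` and `alpha=0.5`, the focal loss reduces to half the ordinary cross-entropy, which is a convenient sanity check.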

Wed 17 Sep

Displayed time zone: Athens

16:00 - 17:40
Automated Test Generation and AI-Driven Testing (General Track) at Atrium C
Chair(s): Tolgahan Bardakci (University of Antwerp and Flanders Make)

16:00 (30m) Talk: On the evaluation of test suites generated by large language models
General Track
Matej Cuze (Graz University of Technology), Franz Wotawa (Technische Universitaet Graz)

16:30 (30m) Talk: On the use of imbalanced datasets for learning-based vulnerability detection
General Track
ROSMAEL ZIDANE LEKEUFACK FOULEFACK (University of Trento), Alessandro Marchetto (Università di Trento)

17:00 (20m) Talk: Tracing Vulnerability Propagation Across Open Source Software Ecosystems
General Track
Jukka Ruohonen (University of Southern Denmark), Qusai Ramadan (The Maersk Mc-Kinney Moller Institute, University of Southern Denmark)

17:20 (20m) Talk: Localization Testing in Video Games using Text Recognition
General Track
Guillermo Jimenez-Diaz (Universidad Complutense de Madrid), Dewei Chen (Universidad Complutense de Madrid)