ICST 2025
Mon 31 March - Fri 4 April 2025 Naples, Italy

This program is tentative and subject to change.

Wed 2 Apr 2025 11:45 - 12:00 at Aula Magna (AM) - LLMs in Testing Chair(s): Phil McMinn

Security vulnerabilities in modern software are prevalent and harmful. While automated vulnerability detection techniques have made promising progress, their scalability and applicability remain challenging. The remarkable performance of Large Language Models (LLMs), such as GPT-4 and CodeLlama, on code-related tasks has prompted recent work to explore whether LLMs can be used to detect security vulnerabilities. In this paper, we perform a more comprehensive study, concurrently examining more datasets, languages, and LLMs than prior work, qualitatively evaluating detection performance across prompts and vulnerability classes, and addressing the shortcomings of existing tools. Concretely, we evaluate the effectiveness of 16 pre-trained LLMs on 5,000 code samples, 1,000 randomly selected from each of five diverse security datasets. These balanced datasets encompass both synthetic and real-world projects in Java and C/C++ and cover 25 distinct vulnerability classes.
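
As a rough sketch of this kind of evaluation protocol (the class, sample, and queryModel names below are illustrative, not from the paper), each code sample is wrapped in a binary-classification prompt and the model's yes/no verdict is scored against the dataset label:

import java.util.List;

public class VulnEvalSketch {

    record Sample(String code, boolean vulnerable) {}

    // Hypothetical stand-in for a real LLM API call.
    static String queryModel(String prompt) {
        return "yes"; // replace with an actual model client
    }

    static String buildPrompt(String code) {
        return "Does the following code contain a security vulnerability? "
             + "Answer yes or no.\n\n" + code;
    }

    public static void main(String[] args) {
        List<Sample> samples = List.of(
            new Sample("strcpy(buf, userInput);", true),
            new Sample("int x = a + b;", false));

        int correct = 0;
        for (Sample s : samples) {
            // Parse the model's free-text answer into a binary verdict.
            boolean predicted = queryModel(buildPrompt(s.code()))
                    .toLowerCase().startsWith("yes");
            if (predicted == s.vulnerable()) correct++;
        }
        System.out.printf("Accuracy: %.2f%n", (double) correct / samples.size());
    }
}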

Overall, LLMs across all scales and families show modest effectiveness in end-to-end reasoning about vulnerabilities, obtaining an average accuracy of 62.8% and an average F1 score of 0.71 across all datasets. They are significantly better at detecting vulnerabilities that typically require only intra-procedural analysis, such as OS Command Injection and NULL Pointer Dereference. Moreover, they achieve higher accuracy on these vulnerabilities than popular static analysis tools such as CodeQL.
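
For illustration, a NULL Pointer Dereference of the kind referred to above can often be spotted from a single method body alone. This Java example is ours, not from the paper's datasets; no cross-procedure reasoning is needed to detect the bug:

import java.util.Map;

public class NpdExample {
    static int valueLength(Map<String, String> config, String key) {
        String value = config.get(key); // may return null for an absent key
        return value.length();          // dereference without a null check
    }

    public static void main(String[] args) {
        // Throws NullPointerException: the key "port" is absent.
        System.out.println(valueLength(Map.of("host", "localhost"), "port"));
    }
}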

We find that advanced prompting strategies involving step-by-step analysis significantly improve the performance of LLMs on real-world datasets in terms of F1 score (by up to 0.18 on average). Interestingly, we observe that LLMs show promise in performing parts of the analysis correctly, such as identifying vulnerability-related specifications (e.g., sources and sinks) and leveraging natural language information to understand code behavior (e.g., to check whether code is sanitized). We expect our insights to guide future work on LLM-augmented vulnerability detection systems.
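
A hypothetical Java example (again ours, not from the paper) of what such step-by-step analysis must identify: the source of untrusted data, the sink, and the absence of sanitization on the path between them, here an OS Command Injection:

import java.io.IOException;

public class CmdInjectionExample {
    public static void main(String[] args) throws IOException {
        // Source: untrusted, attacker-controlled input.
        String fileName = args.length > 0 ? args[0] : "notes.txt";
        // No sanitization: a value like "x; rm -rf ~" flows through unchecked.
        String[] cmd = {"/bin/sh", "-c", "cat " + fileName};
        // Sink: the tainted string is executed by the shell.
        new ProcessBuilder(cmd).inheritIO().start();
    }
}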


Wed 2 Apr

Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

11:00 - 12:30
LLMs in Testing (Research Papers / Industry / Journal-First Papers) at Aula Magna (AM)
Chair(s): Phil McMinn University of Sheffield
11:00
15m
Talk
AugmenTest: Enhancing Tests with LLM-driven Oracles
Research Papers
Shaker Mahmud Khandaker Fondazione Bruno Kessler, Fitsum Kifetew Fondazione Bruno Kessler, Davide Prandi Fondazione Bruno Kessler, Angelo Susi Fondazione Bruno Kessler
Pre-print
11:15
15m
Talk
Impact of Large Language Models of Code on Fault Localization
Research Papers
Suhwan Ji Yonsei University, Sanghwa Lee Kangwon National University, Changsup Lee Kangwon National University, Yo-Sub Han Yonsei University, Hyeonseung Im Kangwon National University, South Korea
11:30
15m
Talk
An Analysis of LLM Fine-Tuning and Few-Shot Learning for Flaky Test Detection and Classification
Research Papers
Riddhi More Ontario Tech University, Jeremy Bradbury Ontario Tech University
11:45
15m
Talk
Evaluating the Effectiveness of LLMs in Detecting Security Vulnerabilities
Research Papers
Avishree Khare, Saikat Dutta Cornell University, Ziyang Li University of Pennsylvania, Alaia Solko-Breslin University of Pennsylvania, Mayur Naik University of Pennsylvania, Rajeev Alur University of Pennsylvania
12:00
15m
Talk
FlakyFix: Using Large Language Models for Predicting Flaky Test Fix Categories and Test Code Repair
Journal-First Papers
Sakina Fatima University of Ottawa, Hadi Hemmati York University, Lionel Briand University of Ottawa, Canada; Lero centre, University of Limerick, Ireland
12:15
15m
Talk
Integrating LLM-based Text Generation with Dynamic Context Retrieval for GUI Testing
Industry
Juyeon Yoon Korea Advanced Institute of Science and Technology, Seah Kim Samsung Research, Somin Kim Korea Advanced Institute of Science and Technology, Sukchul Jung Samsung Research, Shin Yoo Korea Advanced Institute of Science and Technology