This program is tentative and subject to change.

Tue 29 Apr 2025 10:00 - 10:15 at 207 - Session 1 Chair(s): Qinghua Lu

We present a comparative analysis of open-source tools that scan conversational large language models (LLMs) for vulnerabilities, referred to here as scanners. As LLMs become integral to various applications, they also present potential attack surfaces, exposed to security risks such as information leakage and jailbreak attacks. AI red-teaming, adapted from traditional cybersecurity, is recognized by governments and companies as essential, often with emphasis on the challenge of continuously evolving threats. Our study evaluates four prominent, cutting-edge scanners (Garak, Giskard, PyRIT, and CyberSecEval) that address this challenge by automating red-teaming processes. We detail the distinctive features and practical use of these scanners, outline unifying principles of their design, and perform quantitative evaluations to compare them. These evaluations uncover significant reliability issues in detecting successful attacks, highlighting a fundamental gap for future development. Additionally, we contribute a foundational labeled dataset, which serves as an initial step toward bridging this gap. Based on the above, we provide suggestions for future regulations and standardization, as well as strategic recommendations to assist organizations in scanner selection, considering customizability, test-suite comprehensiveness, and industry-specific use cases.

Tue 29 Apr

Displayed time zone: Eastern Time (US & Canada)

09:00 - 10:30
Session 1 (RAIE) at 207
Chair(s): Qinghua Lu Data61, CSIRO
09:00
10m
Day opening
Opening Remarks
RAIE
Qinghua Lu Data61, CSIRO
09:10
50m
Keynote
Keynote 1 by Rick Kazman
RAIE
Rick Kazman University of Hawai‘i at Mānoa
10:00
15m
Talk
Insights and Current Gaps in Open-Source LLM Vulnerability Scanners: A Comparative Analysis
RAIE
Jonathan Brokman Fujitsu Research, Omer Hofman Fujitsu Research, Oren Rachmil Fujitsu Research, Inderjeet Singh Fujitsu Research, Vikas Pahuja Fujitsu Research, Aishvariya Priya Rathina Sabapathy Fujitsu Research, Amit Giloni Fujitsu Research, Roman Vainshtein Fujitsu Research, Hisashi Kojima Fujitsu Research
10:15
12m
Talk
Mitigating Values Debt in Generative AI: Responsible Engineering with Graph RAG
RAIE
Waqar Hussain Data61, CSIRO