Q&AEval: Benchmarking Secure Coding Ability of LLMs on Real-World Tasks
Conversational Models revolutionize the way we think, communicate, and code. Large Language Models (LLMs) such as GPT-o4 can generate thousands of lines of code in seconds, ranging from simple boilerplate functions to large and complex applications. In this study, we evaluate the security and quality of the code produced by LLMs, comparing it to a human baseline derived from a vast corpus of StackOverflow questions and answers.
We queried five LLMs with over 10,000 cybersecurity-related questions from StackOverflow. Using three static code scanners, we automatically identified software vulnerabilities in the AI-generated Java and Python code as well as in the human-provided code snippets from the StackOverflow answers. Based on this data, we analyze what developers can expect from LLM-generated code and how its security level compares to that of human-written code.
We find that popular LLMs generate code that is less secure than code written by human programmers. LLMs often replicate common vulnerability patterns and, in some cases, introduce additional security issues. Our results contradict a previous study on a similar, albeit smaller, dataset.
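As a rough illustration of the kind of comparison described above, the following minimal sketch aggregates static-scanner findings into a per-source vulnerability rate. It is not the authors' pipeline: the function name `vulnerability_rates`, the tuple layout, the scanner and rule names, and all data values are hypothetical and chosen only for illustration. It assumes a snippet counts as vulnerable if at least one scanner flags it.

```python
from collections import defaultdict


def vulnerability_rates(findings, snippet_counts):
    """Return the share of snippets per source with at least one finding.

    findings: iterable of (source, snippet_id, scanner, rule_id) tuples
              (hypothetical layout, not taken from the study).
    snippet_counts: dict mapping source -> total number of scanned snippets.
    """
    flagged = defaultdict(set)
    for source, snippet_id, _scanner, _rule_id in findings:
        # A set deduplicates snippets reported by several scanners or rules.
        flagged[source].add(snippet_id)
    return {
        source: len(flagged[source]) / total
        for source, total in snippet_counts.items()
        if total > 0
    }


# Made-up toy data, NOT results from the study.
toy_findings = [
    ("human", "a1", "scanner_a", "weak-hash"),
    ("llm", "q1", "scanner_a", "weak-hash"),
    ("llm", "q1", "scanner_b", "weak-hash"),  # same snippet, second scanner
    ("llm", "q2", "scanner_c", "sql-injection"),
]
toy_counts = {"human": 4, "llm": 4}

rates = vulnerability_rates(toy_findings, toy_counts)
print(rates)
```

Counting flagged snippets rather than raw findings keeps a snippet reported by several scanners from being counted more than once; a real evaluation would also need to account for scanner false positives.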
Sat 18 Apr (displayed time zone: Brasilia, Distrito Federal, Brazil)
16:00 - 17:30
16:00 | 20m | Talk | LLMs in Code Vulnerability Analysis: A Proof of Concept | SVM | Shaznin Sultana (Ohio University), Sadia Afreen (University of Cincinnati), Nasir Eisty (University of Tennessee-Knoxville)
16:20 | 20m | Talk | Q&AEval: Benchmarking Secure Coding Ability of LLMs on Real-World Tasks | SVM | Markus Toran (Fraunhofer SIT; ATHENE), Bettina Ballin, Marc Miltenberger (Fraunhofer SIT; ATHENE), Steven Arzt (Fraunhofer SIT; ATHENE)
16:40 | 20m | Talk | Process-based Indicators of Vulnerability Re-Introducing Code Changes: An Exploratory Case Study | SVM | Samiha Shimmi (Northern Illinois University), Nicholas Synovic (Loyola University Chicago), Mona Rahimi (Northern Illinois University), George K. Thiruvathukal (Loyola University Chicago)
17:00 | 5m | Day closing | SVM Closure | SVM | Triet Le (Adelaide University)