Benchmarking the Security Aspect of Large Language Model-Based Code Generation
Benchmarks play a pivotal role in advancing research on programming-related tasks. In this study, we introduce PyP4LLMSec, a Python benchmark designed to assess the security of code generated by large language models (LLMs). Our methodology begins with an analysis of Common Vulnerabilities and Exposures (CVEs) disclosed over the past two years. We identified 257 vulnerability-related commits associated with these CVEs across 143 open-source Python projects on GitHub. We then manually inspected the vulnerable code, identifying and analyzing 295 patches that address vulnerabilities, and used them to generate Python code prompts at the file, class, and function granularity levels. In total, we generated 2142 prompts with three distinct types of endings across these granularity levels, covering 15 Common Weakness Enumeration (CWE) categories. To the best of our knowledge, this dataset is the first collection of Python prompts for scrutinizing the security of LLM-generated code at different granularity levels. Our dataset, PyP4LLMSec, is publicly available on GitHub.
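The abstract does not spell out how prompts are derived from a patch, so the following is a minimal illustrative sketch, not the authors' procedure: given a function whose security fix touches a known line, one could truncate the source at that line in three different ways to obtain prompts with distinct endings. All names here (`make_prompts`, the three ending labels, the sample function) are hypothetical.

```python
# Hypothetical sketch of prompt derivation (names and heuristics are
# illustrative, not taken from the paper): truncate a function that was
# later patched so the LLM must regenerate the security-relevant code.

VULNERABLE_FUNCTION = '''\
def load_config(path):
    """Load a YAML config file."""
    import yaml
    with open(path) as f:
        # The security patch replaced yaml.load with yaml.safe_load here.
        return yaml.load(f, Loader=yaml.FullLoader)
'''

def make_prompts(source: str, patched_line: int) -> dict[str, str]:
    """Return prompts with three ending styles, each cut before the
    first line touched by the security patch (1-indexed)."""
    lines = source.splitlines(keepends=True)
    head = lines[:patched_line - 1]
    return {
        # Ending 1: stop right before the patched statement.
        "statement": "".join(head),
        # Ending 2: additionally drop comments and blank lines.
        "no_comment": "".join(
            l for l in head if l.strip() and not l.strip().startswith("#")
        ),
        # Ending 3: keep only the signature and docstring.
        "signature": "".join(head[:2]),
    }

if __name__ == "__main__":
    for kind, prompt in make_prompts(VULNERABLE_FUNCTION, 6).items():
        print(f"--- {kind} ---\n{prompt}")
```

A completion from the model would then be appended to each prompt and checked (for example, with a static analyzer or manual review) for the CWE the original patch addressed.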
Sat 20 Apr (time zone: Lisbon)
14:00 - 15:30 | Session 3: Keynote 2 + Position Papers (LLM4Code) at Luis de Freitas Branco
Chair(s): Lingming Zhang (University of Illinois at Urbana-Champaign)

14:00 | 50m | Keynote | Open development of Large Language Models for code with BigCode and StarCoder2
Loubna Ben Allal (Hugging Face)

14:50 | 8m | Talk | Benchmarking the Security Aspect of Large Language Model-Based Code Generation
Pre-print

14:58 | 8m | Talk | Enhancing LLM-Based Coding Tools through Native Integration of IDE-Derived Static Context
Yichen Li, Yun Peng, Yintong Huo, Michael Lyu (The Chinese University of Hong Kong)
Pre-print

15:06 | 8m | Talk | Evaluating Fault Localization and Program Repair Capabilities of Existing Closed-Source General-Purpose LLMs
Shengbei Jiang, Jiabao Zhang, Wei Chen, Bo Wang (Beijing Jiaotong University), Jianyi Zhou (Huawei Cloud Computing Technologies Co., Ltd.), Jie M. Zhang (King's College London)
Pre-print

15:14 | 8m | Talk | MoonBit: Explore the Design of an AI-Friendly Programming Language
Haoxiang Fei, Yu Zhang, Hongbo Zhang (International Digital Economy Academy), Yanlin Wang (Sun Yat-sen University), Qing Liu (International Digital Economy Academy)
Pre-print

15:22 | 8m | Talk | Toward a New Era of Rapid Development: Assessing GPT-4-Vision's Capabilities in UML-Based Code Generation
Gabor Antal (University of Szeged), Richárd Vozár (Department of Software Engineering, University of Szeged, Hungary), Rudolf Ferenc (University of Szeged)