Large Language Models and Simple, Stupid Bugs
With the advent of powerful neural language models, AI-based systems that assist developers in coding tasks are becoming widely available; Copilot is one such system. Copilot uses Codex, a large language model (LLM), to complete code conditioned on a preceding "prompt". Codex, however, is trained on public GitHub repositories, i.e., on code that may include bugs and vulnerabilities. Previous studies [1], [2] show that Codex reproduces vulnerabilities seen in its training data. In this study, we examine how prone Codex is to generate an interesting category of bugs: single-statement bugs, commonly referred to as simple, stupid bugs or SStuBs in the MSR community. We find that Codex and similar LLMs do help avoid some SStuBs, but produce known, verbatim SStuBs up to twice as often as known, verbatim correct code. We explore the consequences of Codex-generated SStuBs and propose avoidance strategies that suggest the possibility of reducing the production of known, verbatim SStuBs and increasing the possibility of producing known, verbatim fixes.
Tue 16 May (displayed time zone: Hobart)
14:35 - 15:15 | Defect Prediction | Data and Tool Showcase Track / Technical Papers | Meeting Room 109 | Chair(s): Sarra Habchi (Ubisoft)
14:35 | 12m Talk | Large Language Models and Simple, Stupid Bugs | Technical Papers | Kevin Jesse (University of California, Davis, USA), Toufique Ahmed (University of California, Davis), Prem Devanbu (University of California, Davis), Emily Morgan (University of California, Davis) | Pre-print
14:47 | 12m Talk | The ABLoTS Approach for Bug Localization: Is It Replicable and Generalizable? (Distinguished Paper Award) | Technical Papers | Feifei Niu (University of Ottawa), Christoph Mayr-Dorn (Johannes Kepler University Linz), Wesley Assunção (Johannes Kepler University Linz, Austria & Pontifical Catholic University of Rio de Janeiro, Brazil), Liguo Huang (Southern Methodist University), Jidong Ge (Nanjing University), Bin Luo (Nanjing University), Alexander Egyed (Johannes Kepler University Linz) | Pre-print, File Attached
14:59 | 6m Talk | LLMSecEval: A Dataset of Natural Language Prompts for Security Evaluations | Data and Tool Showcase Track | Catherine Tony (Hamburg University of Technology), Markus Mutas (Hamburg University of Technology), Nicolás E. Díaz Ferreyra (Hamburg University of Technology), Riccardo Scandariato (Hamburg University of Technology) | Pre-print
15:05 | 6m Talk | Defectors: A Large, Diverse Python Dataset for Defect Prediction | Data and Tool Showcase Track | Parvez Mahbub (Dalhousie University), Ohiduzzaman Shuvo (Dalhousie University), Masud Rahman (Dalhousie University) | Pre-print