Retrieval-Based Prompt Selection for Code-Related Few-Shot Learning
Large language models trained on massive code corpora can generalize to new tasks without task-specific fine-tuning. In few-shot learning, these models take as input a prompt composed of natural language instructions, a few task demonstrations, and a query, and generate an output. However, creating an effective prompt for code-related tasks in few-shot learning has received little attention. We present a retrieval-based technique for effectively composing the ingredients of a prompt. We apply our approach, CEDAR, to two different programming languages, one statically and one dynamically typed, and two different tasks, namely assertion generation and program repair. For each task, we compare CEDAR with state-of-the-art task-specific and fine-tuned models. Our empirical results show that, with only a few code demonstrations, retrieval-based demonstration selection is effective in both tasks, achieving exact-match accuracies of 76% and 52% for test assertion generation and program repair, respectively. For assertion generation, CEDAR outperforms existing task-specific and fine-tuned models by 333% and 11%, respectively; for program repair, CEDAR yields 189% higher accuracy than task-specific models and is competitive with recent fine-tuned models. These findings have practical implications for practitioners, as CEDAR could potentially be applied to multilingual and multitask settings without task- or language-specific training, with minimal examples and effort.
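The abstract describes composing a prompt from natural language instructions, retrieved task demonstrations, and a query. The following Python sketch illustrates the general idea under simplifying assumptions: it ranks candidate (input, output) demonstration pairs by a crude token-overlap similarity to the query and assembles the top matches into a prompt. The function names, the token-overlap retriever, and the prompt template are illustrative assumptions, not CEDAR's actual implementation.

# Minimal sketch of retrieval-based demonstration selection for few-shot prompting.
# The retriever, placeholder format, and prompt template below are illustrative
# assumptions, not CEDAR's actual design.

def tokenize(code):
    """Crude lexical tokenization standing in for a real code retriever/embedder."""
    return set(code.replace("(", " ").replace(")", " ").replace(";", " ").split())

def similarity(a, b):
    """Jaccard similarity over token sets (any retrieval strategy could be substituted)."""
    ta, tb = tokenize(a), tokenize(b)
    return len(ta & tb) / len(ta | tb) if (ta | tb) else 0.0

def select_demonstrations(query, pool, k=4):
    """Pick the k (input, output) pairs from the demonstration pool most similar to the query."""
    ranked = sorted(pool, key=lambda pair: similarity(query, pair[0]), reverse=True)
    return ranked[:k]

def build_prompt(instruction, demos, query):
    """Assemble instructions, retrieved demonstrations, and the query into a single prompt."""
    parts = [instruction]
    for src, tgt in demos:
        parts.append("### Input:\n" + src + "\n### Output:\n" + tgt)
    parts.append("### Input:\n" + query + "\n### Output:")
    return "\n\n".join(parts)

if __name__ == "__main__":
    # Hypothetical demonstration pool for assertion generation.
    pool = [
        ('testAdd() { int r = add(1, 2); "<AssertPlaceHolder>"; }', "assertEquals(3, r)"),
        ('testIsEmpty() { boolean r = isEmpty(list); "<AssertPlaceHolder>"; }', "assertTrue(r)"),
    ]
    query = 'testAdd() { int r = add(2, 2); "<AssertPlaceHolder>"; }'
    demos = select_demonstrations(query, pool, k=1)
    print(build_prompt("Complete the assertion.", demos, query))

In this sketch the most similar demonstration is placed before the query; the resulting prompt would then be sent to a large language model to generate the missing output.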
Fri 19 May (displayed time zone: Hobart)
15:45 - 17:15 | Session: Pre-trained and few shot learning for SE | Technical Track / Journal-First Papers | Meeting Room 103 | Chair(s): Yiling Lou (Fudan University)
15:45 (15m, Talk) | On the validity of pre-trained transformers for natural language processing in the software engineering domain | Journal-First Papers | Alexander Trautsch (University of Passau), Julian von der Mosel, Steffen Herbold (University of Passau)
16:00 (15m, Talk) | Automating Code-Related Tasks Through Transformers: The Impact of Pre-training | Technical Track | Rosalia Tufano (Università della Svizzera italiana), Luca Pascarella (ETH Zurich), Gabriele Bavota (Software Institute, USI Università della Svizzera italiana)
16:15 (15m, Talk) | Log Parsing with Prompt-based Few-shot Learning | Technical Track | Pre-print
16:30 (15m, Talk) | Retrieval-Based Prompt Selection for Code-Related Few-Shot Learning | Technical Track | Noor Nashid (University of British Columbia), Mifta Sintaha (University of British Columbia), Ali Mesbah (University of British Columbia) | Pre-print
16:45 (15m, Paper) | An Empirical Study of Pre-Trained Model Reuse in the Hugging Face Deep Learning Model Registry | Technical Track | Wenxin Jiang (Purdue University), Nicholas Synovic (Loyola University Chicago), Matt Hyatt (Loyola University Chicago), Taylor R. Schorlemmer (Purdue University), Rohan Sethi (Loyola University Chicago), Yung-Hsiang Lu (Purdue University), George K. Thiruvathukal (Loyola University Chicago and Argonne National Laboratory), James C. Davis (Purdue University) | Pre-print
17:00 (15m, Talk) | ContraBERT: Enhancing Code Pre-trained Models via Contrastive Learning | Technical Track | Shangqing Liu (Nanyang Technological University), Bozhi Wu (Nanyang Technological University), Xiaofei Xie (Singapore Management University), Guozhu Meng (Institute of Information Engineering, Chinese Academy of Sciences), Yang Liu (Nanyang Technological University)