Very large language models (LLMs), such as GPT-3 and Codex, have achieved state-of-the-art performance on several natural-language tasks and also show great promise for code. A particularly exciting aspect of LLMs is their capacity for few-shot and zero-shot learning: they can learn to perform a task with very few examples. Few-shot learning has particular synergies in software engineering, where many phenomena (identifier names, APIs, terminology, coding patterns) are known to be highly project-specific. However, project-specific data can be quite limited, especially early in a project's history, so the few-shot learning capacity of LLMs may be especially relevant. In this paper, we investigate the use of few-shot training with the very large GPT (Generative Pre-trained Transformer) Codex model, and find evidence suggesting that one can significantly surpass state-of-the-art models for code summarization by leveraging project-specific training.
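To make the idea concrete, here is a minimal sketch of what few-shot prompting for project-specific code summarization might look like, assuming the legacy OpenAI completions API and a Codex-style model. The prompt layout, the example (code, summary) pairs, and the model and decoding settings below are illustrative assumptions, not the paper's exact setup.

```python
import openai  # legacy OpenAI Python client (<1.0); an illustrative assumption

# Hypothetical project-specific (code, summary) pairs drawn from the same
# repository as the target function; these act as the few-shot examples.
FEW_SHOT_EXAMPLES = [
    ("def is_empty(self):\n    return len(self._items) == 0",
     "Return True if the container holds no items."),
    ("def clear(self):\n    self._items = []",
     "Remove all items from the container."),
]

def build_prompt(target_code: str) -> str:
    """Concatenate the few-shot pairs, then the target function,
    leaving the final summary for the model to complete."""
    parts = []
    for code, summary in FEW_SHOT_EXAMPLES:
        parts.append(f"# Code:\n{code}\n# Summary: {summary}\n")
    parts.append(f"# Code:\n{target_code}\n# Summary:")
    return "\n".join(parts)

def summarize(target_code: str) -> str:
    # "code-davinci-002" stands in for a Codex-style model; the paper's
    # exact model choice and decoding settings are not reproduced here.
    response = openai.Completion.create(
        model="code-davinci-002",
        prompt=build_prompt(target_code),
        max_tokens=30,
        temperature=0.0,
        stop=["\n"],  # the summary is a single line
    )
    return response.choices[0].text.strip()
```

The design point suggested by the abstract is that the few-shot examples come from the same project as the target function, so project-specific identifiers, APIs, and conventions appear in the prompt rather than only generic code.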
Thu 13 Oct (displayed time zone: Eastern Time, US & Canada)
10:00 - 12:00 | Technical Session 22 - Code Summarization and Recommendation (Research Papers / NIER Track / Journal-first Papers / Industry Showcase), Banquet A. Chair(s): Houari Sahraoui (Université de Montréal)

10:00 (20m, Research paper) Identifying Solidity Smart Contract API Documentation Errors (Research Papers). Chenguang Zhu (The University of Texas at Austin), Ye Liu (Nanyang Technological University), Xiuheng Wu (Nanyang Technological University, Singapore), Yi Li (Nanyang Technological University). Pre-print.

10:20 (10m, Vision and Emerging Results) Few-shot training LLMs for project-specific code-summarization (NIER Track). Toufique Ahmed (University of California at Davis), Prem Devanbu (Department of Computer Science, University of California, Davis). DOI, Pre-print.

10:30 (20m, Research paper) Answer Summarization for Technical Queries: Benchmark and New Approach (Research Papers). Chengran Yang (Singapore Management University), Bowen Xu (School of Information Systems, Singapore Management University), Ferdian Thung (Singapore Management University), Yucen Shi (Singapore Management University), Ting Zhang (Singapore Management University), Zhou Yang (Singapore Management University), Xin Zhou, Jieke Shi (Singapore Management University), Junda He (Singapore Management University), DongGyun Han (Royal Holloway, University of London), David Lo (Singapore Management University).

10:50 (20m, Paper, virtual) Code Structure Guided Transformer for Source Code Summarization (Journal-first Papers). Shuzheng Gao (Harbin Institute of Technology), Cuiyun Gao (Harbin Institute of Technology), Yulan He (University of Warwick), Jichuan Zeng (The Chinese University of Hong Kong), Lun Yiu Nie (Tsinghua University), Xin Xia (Huawei Software Engineering Application Technology Lab), Michael Lyu (The Chinese University of Hong Kong).

11:10 (10m, Vision and Emerging Results, virtual) Taming Multi-Output Recommenders for Software Engineering (NIER Track). Christoph Treude (University of Melbourne).

11:20 (20m, Industry talk, virtual) MV-HAN: A Hybrid Attentive Networks based Multi-View Learning Model for Large-scale Contents Recommendation (Industry Showcase). Ge Fan (Tencent Inc.), Chaoyun Zhang (Tencent Inc.), Kai Wang (Tencent Inc.), Junyang Chen (Shenzhen University). DOI, Pre-print.

11:40 (20m, Research paper, virtual) Which Exception Shall We Throw? (Research Papers). Hao Zhong (Shanghai Jiao Tong University).