What Makes Good In-context Demonstrations for Code Intelligence Tasks with LLMs?
Pre-trained models of source code have gained widespread popularity in many code intelligence tasks. Recently, with the scaling of model and corpus sizes, large language models (LLMs) have shown the ability of in-context learning (ICL). ICL combines a task instruction and a few examples into a demonstration, which is then fed to the language model to make predictions. This new learning paradigm is training-free and has shown impressive performance on various natural language processing and code intelligence tasks. However, the performance of ICL heavily relies on the quality of the demonstration, e.g., the selected examples. It is therefore important to systematically investigate how to construct good demonstrations for code-related tasks. In this paper, we empirically explore the impact of three key factors on the performance of ICL in code intelligence tasks: the selection, order, and number of demonstration examples. We conduct extensive experiments on three code intelligence tasks: code summarization, bug fixing, and program synthesis. Our experimental results show that all three factors dramatically impact the performance of ICL in code intelligence tasks. We summarize our findings and provide takeaway suggestions for constructing effective demonstrations from these three perspectives. We also show that a carefully designed demonstration based on our findings can yield substantial improvements over widely used demonstration construction methods, e.g., improving BLEU-4, EM, and EM by at least 9.90%, 175.96%, and 50.81% on code summarization, bug fixing, and program synthesis, respectively.
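The abstract's notion of a demonstration can be made concrete with a minimal sketch, assuming a simple text-prompt interface; `build_icl_prompt` is a hypothetical helper (not the authors' code) that shows where the three studied factors — which examples are selected, the order they appear in, and how many are included — enter the prompt:

```python
def build_icl_prompt(instruction, demonstrations, query_code, k=3):
    """Assemble an ICL prompt from an instruction, k demonstration pairs, and a query.

    The three factors studied in the paper map onto this function:
    selection and order are encoded in the `demonstrations` list itself,
    and `k` controls the number of examples kept.
    """
    parts = [instruction]
    for code, summary in demonstrations[:k]:  # number: keep only the first k examples
        parts.append(f"Code:\n{code}\nSummary: {summary}")
    parts.append(f"Code:\n{query_code}\nSummary:")  # the model completes this line
    return "\n\n".join(parts)


# Which pairs appear here, and in what sequence, are exactly the
# selection and order factors the paper evaluates.
demos = [
    ("def add(a, b):\n    return a + b", "Add two numbers."),
    ("def is_even(n):\n    return n % 2 == 0", "Check whether a number is even."),
]
prompt = build_icl_prompt(
    "Summarize the given code.",
    demos,
    "def neg(x):\n    return -x",
    k=2,
)
```

Because the prompt is assembled rather than trained, swapping, reordering, or truncating `demos` changes model behavior without any parameter updates — which is why the paper can study these factors purely at inference time.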
Wed 13 Sep
13:30 - 15:00
|Delving into Commit-Issue Correlation to Enhance Commit Message Generation Models|
Liran Wang Beihang University, Xunzhu Tang University of Luxembourg, Yichen He Beihang University, Changyu Ren Beihang University, Shuhua Shi Beihang University, Chaoran Yan Beihang University, Zhoujun Li Beihang University
|From Commit Message Generation to History-Aware Commit Message Completion|
Aleksandra Eliseeva JetBrains Research, Yaroslav Sokolov JetBrains, Egor Bogomolov JetBrains Research, Yaroslav Golubev JetBrains Research, Danny Dig JetBrains Research & University of Colorado Boulder, USA, Timofey Bryksin JetBrains Research
|Automatic Generation and Reuse of Precise Library Summaries for Object-Sensitive Pointer Analysis|
Jingbo Lu University of New South Wales, Dongjie He UNSW, Wei Li University of New South Wales, Yaoqing Gao Huawei Toronto Research Center, Jingling Xue UNSW
|What Makes Good In-context Demonstrations for Code Intelligence Tasks with LLMs?|
Shuzheng Gao The Chinese University of Hong Kong, Xin-Cheng Wen Harbin Institute of Technology, Cuiyun Gao Harbin Institute of Technology, Wenxuan Wang Chinese University of Hong Kong, Hongyu Zhang Chongqing University, Michael Lyu The Chinese University of Hong Kong
|HexT5: Unified Pre-training for Stripped Binary Code Information Inference|
Jiaqi Xiong University of Science and Technology of China, Guoqiang Chen University of Science and Technology of China, Kejiang Chen University of Science and Technology of China, Han Gao University of Science and Technology of China, Shaoyin Cheng University of Science and Technology of China, Weiming Zhang University of Science and Technology of China
|Generating Variable Explanations via Zero-shot Prompt Learning|
Research Papers