Code Search is All You Need? Improving Code Suggestions with Code Search
Modern integrated development environments (IDEs) provide various automated code suggestion techniques (e.g., code completion and code generation) to help developers improve their efficiency. Such techniques may retrieve similar code snippets from the code base or leverage deep learning models to provide code suggestions. However, how to effectively enhance code suggestions using code retrieval has not been systematically investigated. In this paper, we study and explore a retrieval-augmented framework for code suggestions. Specifically, our framework leverages different retrieval approaches and search strategies to find similar code snippets. The retrieved code is then used to further enhance the performance of language models on code suggestions. We conduct experiments by integrating different language models into our framework and comparing the results with those of the original models. We find that our framework noticeably improves the performance of both code completion and code generation, by up to 38.5% and 130.8% in terms of accuracy and BLEU-4, respectively. Our study highlights that integrating retrieval into code suggestion can improve its performance by a large margin.
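The retrieve-then-augment idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Jaccard token-overlap retriever, the prompt layout, and the names `tokenize`, `jaccard`, `retrieve`, `build_prompt`, and `CODE_BASE` are all assumptions chosen for illustration, and the abstract does not say which retrieval approaches or language models the paper uses. The sketch stops at building the augmented input and does not call any model.

```python
import re

# Hypothetical toy code base; a real one would be a project or a corpus.
CODE_BASE = [
    "def read_json(path):\n    with open(path) as f:\n        return json.load(f)",
    "def add(a, b):\n    return a + b",
    "def write_json(path, obj):\n    with open(path, 'w') as f:\n        json.dump(obj, f)",
]


def tokenize(code: str) -> set[str]:
    """Split code into a set of identifier and number tokens."""
    return set(re.findall(r"[A-Za-z_]\w*|\d+", code))


def jaccard(a: set[str], b: set[str]) -> float:
    """Token-set overlap in [0, 1]; an assumed stand-in for a real retriever."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def retrieve(query: str, code_base: list[str], k: int = 1) -> list[str]:
    """Return the k snippets most similar to the query, best first."""
    query_tokens = tokenize(query)
    ranked = sorted(
        code_base,
        key=lambda snippet: jaccard(query_tokens, tokenize(snippet)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, code_base: list[str], k: int = 1) -> str:
    """Prepend retrieved snippets to the unfinished code as extra context."""
    context = "\n".join(
        f"# Retrieved example:\n{snippet}" for snippet in retrieve(query, code_base, k)
    )
    return f"{context}\n# Complete the following code:\n{query}"


if __name__ == "__main__":
    unfinished = "def read_yaml(path):\n    with open(path) as f:"
    print(build_prompt(unfinished, CODE_BASE, k=1))
```

The string returned by `build_prompt` would then be passed to a language model in place of the bare unfinished code; swapping `jaccard` for another similarity function changes the retrieval approach without touching the rest of the pipeline.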
Wed 17 Apr (displayed time zone: Lisbon)
11:00 - 12:30 | Language Models and Generated Code 1 | Research Track / New Ideas and Emerging Results | at Maria Helena Vieira da Silva | Chair(s): Yiling Lou (Fudan University)
11:00 (15m) | Talk | Modularizing while Training: a New Paradigm for Modularizing DNN Models | Research Track | Binhang Qi (Beihang University), Hailong Sun (Beihang University), Hongyu Zhang (Chongqing University), Ruobing Zhao (Beihang University), Xiang Gao (Beihang University)
11:15 (15m) | Research paper | KnowLog: Knowledge Enhanced Pre-trained Language Model for Log Understanding | Research Track | Lipeng Ma (Fudan University), Weidong Yang (Fudan University), Bo Xu (Donghua University), Sihang Jiang (Fudan University), Ben Fei (Fudan University), Jiaqing Liang (Fudan University), Mingjie Zhou (Fudan University), Yanghua Xiao (Fudan University)
11:30 (15m) | Talk | FAIR: Flow Type-Aware Pre-Training of Compiler Intermediate Representations | Research Track | Changan Niu (Software Institute, Nanjing University), Chuanyi Li (Nanjing University), Vincent Ng (Human Language Technology Research Institute, University of Texas at Dallas, Richardson, TX 75083-0688), David Lo (Singapore Management University), Bin Luo (Nanjing University)
11:45 (15m) | Talk | Unveiling Memorization in Code Models | Research Track | Zhou Yang (Singapore Management University), Zhipeng Zhao (Singapore Management University), Chenyu Wang (Singapore Management University), Jieke Shi (Singapore Management University), Dongsun Kim (Kyungpook National University), DongGyun Han (Royal Holloway, University of London), David Lo (Singapore Management University)
12:00 (15m) | Talk | Code Search is All You Need? Improving Code Suggestions with Code Search | Research Track | Junkai Chen (Zhejiang University), Xing Hu (Zhejiang University), Zhenhao Li (Concordia University), Cuiyun Gao (Harbin Institute of Technology), Xin Xia (Huawei Technologies), David Lo (Singapore Management University)
12:15 (7m) | Talk | Expert Monitoring: Human-Centered Concept Drift Detection in Machine Learning Operations | New Ideas and Emerging Results | Joran Leest (Vrije Universiteit Amsterdam), Claudia Raibulet (Vrije Universiteit Amsterdam), Ilias Gerostathopoulos (Vrije Universiteit Amsterdam), Patricia Lago (Vrije Universiteit Amsterdam)