deGraphCS: Embedding Variable-based Flow Graph for Neural Code Search
With the rapid growth of public code repositories, developers increasingly want to retrieve precise code snippets using natural language. Although existing deep learning-based approaches provide end-to-end solutions (i.e., they accept natural language queries and return related code fragments), code search over large-scale repositories still suffers from low accuracy because of limitations in code representation (e.g., AST) and modeling (e.g., directly fusing features in the attention stage). In this paper, we propose a novel learnable deep Graph for Code Search (called deGraphCS) that transforms source code into variable-based flow graphs using an intermediate representation technique, which models code semantics more precisely than processing the code directly as text or using a syntax tree representation. Furthermore, we propose a graph optimization mechanism to refine the code representation and apply an improved gated graph neural network to model the variable-based flow graphs. To evaluate the effectiveness of deGraphCS, we collect a large-scale dataset from GitHub containing 41,152 code snippets written in the C language and reproduce several typical deep code search methods for comparison. The experimental results show that deGraphCS achieves state-of-the-art performance and accurately retrieves code snippets that satisfy users' needs.
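To make the idea of a variable-based flow graph concrete, here is a minimal sketch. It is not the authors' construction: the toy three-address IR, the edge format, and the helper names `build_variable_flow_graph` and `reachable` are all assumptions made for illustration. It only shows the general principle that nodes are variables and edges record how a value flows from one variable into another.

```python
from collections import defaultdict

# Hypothetical toy three-address IR: (destination, operation, operands).
# This stands in for a real intermediate representation; it is an assumption.
TOY_IR = [
    ("a", "load", ["x"]),
    ("b", "load", ["y"]),
    ("c", "add", ["a", "b"]),
    ("d", "mul", ["c", "a"]),
    ("ret", "return", ["d"]),
]


def build_variable_flow_graph(instructions):
    """Return {source_variable: [(destination_variable, operation), ...]}.

    Each operand of an instruction contributes one edge to the variable
    that the instruction defines, labeled with the operation.
    """
    graph = defaultdict(list)
    for dest, op, operands in instructions:
        for src in operands:
            graph[src].append((dest, op))
    return dict(graph)


def reachable(graph, start):
    """Return the set of variables that the value of `start` can flow into."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt, _ in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


graph = build_variable_flow_graph(TOY_IR)
```

In this toy graph, `reachable(graph, "x")` follows the value of `x` through `a`, `c`, and `d` to `ret`. A graph neural network would then propagate learned node states along these same edges; that step is omitted here.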
Wed 17 May (displayed time zone: Hobart)
Session: AI models for SE, 11:00 - 12:30, at Level G - Plenary Room 1
Tracks: Journal-First Papers / Technical Track / DEMO - Demonstrations / NIER - New Ideas and Emerging Results
Chair(s): Denys Poshyvanyk (College of William and Mary)

| Time | Duration | Talk | Track | Authors |
| --- | --- | --- | --- | --- |
| 11:00 | 15m | One Adapter for All Programming Languages? Adapter Tuning for Multilingual Tasks in Software Engineering | Technical Track | Deze Wang (National University of Defense Technology); Boxing Chen; Shanshan Li (National University of Defense Technology); Wei Luo; Shaoliang Peng (Hunan University); Wei Dong (School of Computer, National University of Defense Technology, China); Liao Xiangke (National University of Defense Technology) |
| 11:15 | 15m | CCRep: Learning Code Change Representations via Pre-Trained Code Model and Query Back | Technical Track | Zhongxin Liu (Zhejiang University); Zhijie Tang (Zhejiang University); Xin Xia (Huawei); Xiaohu Yang (Zhejiang University) |
| 11:30 | 15m | Keeping Pace with Ever-Increasing Data: Towards Continual Learning of Code Intelligence Models | Technical Track | Shuzheng Gao (Harbin Institute of Technology); Hongyu Zhang (The University of Newcastle); Cuiyun Gao (Harbin Institute of Technology); Chaozheng Wang (Harbin Institute of Technology) |
| 11:45 | 7m | PCR-Chain: Partial Code Reuse Assisted by Hierarchical Chaining of Prompts on Frozen Copilot | DEMO - Demonstrations | Qing Huang (School of Computer Information Engineering, Jiangxi Normal University); Jiahui Zhu (School of Computer Information Engineering, Jiangxi Normal University); Zhilong Li (School of Computer Information Engineering, Jiangxi Normal University); Zhenchang Xing; Changjing Wang (School of Computer Information Engineering, Jiangxi Normal University); Xiwei (Sherry) Xu (CSIRO's Data61) |
| 11:52 | 7m | Towards Learning Generalizable Code Embeddings using Task-agnostic Graph Convolutional Networks | Journal-First Papers | Zishuo Ding (Concordia University); Heng Li (Polytechnique Montréal); Weiyi Shang (University of Waterloo); Tse-Hsun (Peter) Chen (Concordia University) |
| 12:00 | 7m | deGraphCS: Embedding Variable-based Flow Graph for Neural Code Search | Journal-First Papers | Chen Zeng (National University of Defense Technology); Yue Yu (College of Computer, National University of Defense Technology, Changsha 410073, China); Shanshan Li (National University of Defense Technology); Xin Xia (Huawei); Wang Zhiming (National University of Defense Technology); Mingyang Geng (National University of Defense Technology); Linxiao Bai (National University of Defense Technology); Wei Dong (School of Computer, National University of Defense Technology, China); Liao Xiangke (National University of Defense Technology) |
| 12:07 | 7m | CodeS: Towards Code Model Generalization Under Distribution Shift | NIER - New Ideas and Emerging Results | Qiang Hu (University of Luxembourg); Yuejun Guo (University of Luxembourg); Xiaofei Xie (Singapore Management University); Maxime Cordy (University of Luxembourg, Luxembourg); Lei Ma (University of Alberta); Mike Papadakis (University of Luxembourg, Luxembourg); Yves Le Traon (University of Luxembourg, Luxembourg) |
| 12:15 | 7m | Towards using Few-Shot Prompt Learning for Automating Model Completion | NIER - New Ideas and Emerging Results | Meriem Ben Chaaben (Université de Montréal, DIRO); Lola Burgueño (University of Malaga); Houari Sahraoui (Université de Montréal) |


