Improving Code Search with Co-Attentive Representation Learning
Searching and reusing existing code from a large-scale codebase, e.g., GitHub, can help developers complete a programming task efficiently. Recently, Gu et al. proposed a deep learning-based model (i.e., DeepCS) that significantly outperformed prior models. DeepCS embeds code snippets and natural language queries into vectors with two separate LSTM (long short-term memory) models, and returns the code snippets most similar to a code search query. However, this embedding method learns two isolated representations for code and query and ignores their internal semantic correlations. As a result, the isolated representations may limit the effectiveness of code search. To address this issue, we propose a co-attentive representation learning model, i.e., Co-Attentive Representation Learning Code Search-CNN (CARLCS-CNN). CARLCS-CNN learns interdependent representations for the embedded code and query with a co-attention mechanism. Generally, this mechanism learns a correlation matrix between the embedded code and query, and co-attends their semantic relationship via row/column-wise max-pooling. In this way, the semantic correlation between code and query directly affects their individual representations. We evaluate the effectiveness of CARLCS-CNN on Gu et al.'s dataset with 10k queries. Experimental results show that CARLCS-CNN significantly outperforms DeepCS by 26.72% in terms of MRR (mean reciprocal rank). Additionally, CARLCS-CNN is five times faster than DeepCS in model training and four times faster in testing.
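The co-attention step described above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: it assumes token-level code embeddings `C`, query embeddings `Q`, and a learned bilinear correlation matrix `U` (all names hypothetical); CARLCS-CNN additionally applies CNN layers before attention, which are omitted here.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def co_attention(C, Q, U):
    """Co-attend code token embeddings C (m x d) and query token
    embeddings Q (n x d) through a correlation matrix, then pool.

    Returns attended code and query representations (each of size d).
    """
    # Correlation matrix between every code token and every query token (m x n).
    F = np.tanh(C @ U @ Q.T)
    # Row-wise max-pooling: how strongly each code token matches the query.
    a_c = softmax(F.max(axis=1))
    # Column-wise max-pooling: how strongly each query token matches the code.
    a_q = softmax(F.max(axis=0))
    # Attention-weighted sums give interdependent representations.
    r_c = a_c @ C
    r_q = a_q @ Q
    return r_c, r_q

# Toy usage: 7 code tokens, 4 query tokens, embedding size 16.
rng = np.random.default_rng(0)
C = rng.standard_normal((7, 16))
Q = rng.standard_normal((4, 16))
U = 0.1 * rng.standard_normal((16, 16))
r_c, r_q = co_attention(C, Q, U)
# Rank candidates by cosine similarity between the two representations.
sim = (r_c @ r_q) / (np.linalg.norm(r_c) * np.linalg.norm(r_q))
```

Because the attention weights for the code depend on the query (and vice versa) through the shared correlation matrix `F`, each side's final representation reflects the other, which is the interdependence the model exploits at ranking time.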
Tue 14 Jul (times shown in UTC, Coordinated Universal Time)
07:00 - 08:00 | Session 5: For Researchers (Research / ERA / Tool Demonstration at ICPC). Chair(s): Bin Lin, Università della Svizzera italiana (USI)
07:00 | 15m | Paper (Research): A Literature Review of Automatic Traceability Links Recovery for Software Change Impact Analysis. Thazin Win Win Aung, University of Technology Sydney; Yulei Sui, University of Technology Sydney, Australia; Huan Huo, University of Technology Sydney
07:15 | 15m | Paper (Research): Improving Code Search with Co-Attentive Representation Learning. Jianhang Shuai, School of Big Data & Software Engineering, Chongqing University; Ling Xu, School of Big Data & Software Engineering, Chongqing University; Chao Liu, Zhejiang University; Meng Yan, School of Big Data & Software Engineering, Chongqing University; Xin Xia, Monash University; Yan Lei, School of Big Data & Software Engineering, Chongqing University
07:30 | 15m | Paper (Tool Demonstration): OpenSZZ: A Free, Open-Source, Web-Accessible Implementation of the SZZ Algorithm. Valentina Lenarduzzi, LUT University; Fabio Palomba, University of Salerno; Davide Taibi, Tampere University; Damian Andrew Tamburri, Jheronimus Academy of Data Science
07:45 | 15m | Paper (ERA): Staged Tree Matching for Detecting Code Move across Files. Akira Fujimoto, Osaka University; Yoshiki Higo, Osaka University; Junnosuke Matsumoto; Shinji Kusumoto, Osaka University