Adaptive Deep Code Search (ICPC 2020 - Research)

Who

ChunYang Ling, Zeqi Lin, Yanzhen Zou, Bing Xie

Track

ICPC 2020 Research

Time Zone

The program is currently displayed in (UTC) Coordinated Universal Time.

When

Wed 15 Jul 2020 08:50 - 09:10 at ICPC - Session 11: Search Chair(s): Banani Roy

Abstract

Searching code in a large-scale codebase using natural language queries is a common practice during software development. Deep learning-based code search methods demonstrate superior performance when models are trained on a large number of text-code pairs. However, few deep code search models can be easily transferred from one codebase to another, and it can be very costly to prepare training data for a new codebase and retrain an appropriate deep learning model. In this paper, we propose AdaCS, an adaptive deep code search method that can be trained once and transferred to new codebases. AdaCS decomposes the learning process into two parts: embedding domain-specific words and matching general syntactic patterns. First, an unsupervised word embedding technique is used to construct a matching matrix that represents the lexical similarities. Then, a recurrent neural network is used to capture latent syntactic patterns from these matching matrices in a supervised way. Because the supervised task learns general syntactic patterns that exist across domains, AdaCS is transferable to new codebases. Experimental results show that, when extended to new software projects never seen in the training data, AdaCS is more robust and significantly outperforms state-of-the-art deep code search methods.
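To make the first step of the abstract more concrete, the following is a minimal sketch of a matching matrix between query words and code tokens. It is an illustration only, not the authors' implementation: the abstract does not say which similarity measure AdaCS uses, so cosine similarity is an assumption here, and the function names (cosine, matching_matrix) and the three-dimensional TOY_EMBEDDINGS table are hypothetical. In practice the vectors would come from an unsupervised word embedding model trained on the target codebase, and the recurrent network that reads the matrix is not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def matching_matrix(query_tokens, code_tokens, embeddings):
    """Return a len(query_tokens) x len(code_tokens) matrix of lexical similarities.

    Tokens missing from the embedding table are mapped to a zero vector,
    so their similarity to everything is 0.0 (an assumption of this sketch).
    """
    dim = len(next(iter(embeddings.values())))
    zero = [0.0] * dim
    return [
        [cosine(embeddings.get(q, zero), embeddings.get(c, zero)) for c in code_tokens]
        for q in query_tokens
    ]


# Hypothetical toy embeddings, hand-written for illustration only.
TOY_EMBEDDINGS = {
    "read": [1.0, 0.0, 0.0],
    "load": [0.9, 0.1, 0.0],
    "file": [0.0, 1.0, 0.0],
    "path": [0.0, 0.8, 0.2],
    "int": [0.0, 0.0, 1.0],
}

query = ["read", "file"]
code = ["load", "path", "int"]
matrix = matching_matrix(query, code, TOY_EMBEDDINGS)

for word, row in zip(query, matrix):
    print(word, [round(x, 2) for x in row])
```

Each row corresponds to one query word and each column to one code token. A supervised model would then consume matrices like this one rather than the raw words, which is the property the abstract credits for transferability across codebases.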

Link to Preprint

https://laurence-ling.github.io/paper/ICPC20-AdaCS-perprint.pdf

ChunYang Ling

Peking University

China

Zeqi Lin

Peking University

China

Yanzhen Zou

Peking University

Bing Xie

Peking University

Session Program

Wed 15 Jul
Displayed time zone: (UTC) Coordinated Universal Time

08:30 - 09:30	Session 11: Search, Research at ICPC. Chair(s): Banani Roy (University of Saskatchewan)

08:30 20m Paper		GGF: A Graph-based Method for Programming Language Syntax Error Correction. Research. Liwei Wu (Nanjing University), Fei Li (Nanjing University), Youhua Wu (Nanjing University), Tao Zheng (Nanjing University)
08:50 20m Paper		Adaptive Deep Code Search. Research. ChunYang Ling (Peking University), Zeqi Lin (Peking University), Yanzhen Zou (Peking University), Bing Xie (Peking University)
09:10 20m Paper		Duplicate Bug Report Detection Using Dual-Channel Convolutional Neural Networks. Research. Jianjun He (School of Big Data & Software Engineering, Chongqing University), Ling Xu (School of Big Data & Software Engineering, Chongqing University), Meng Yan (School of Big Data & Software Engineering, Chongqing University), Xin Xia (Monash University), Yan Lei (School of Big Data & Software Engineering, Chongqing University)