CSRS: Code Search with Relevance Matching and Semantic Matching (ICPC 2022 - Research)

Who

Yi Cheng, Li Kuang

Track

ICPC 2022 Research

Time Zone

The program is currently displayed in (GMT-04:00) Eastern Time (US & Canada).

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Tue 17 May 2022 03:00 - 03:07 at ICPC room - Session 12: Search and Reuse: Code. Chair(s): Fuxiang Chen

Abstract

Developers often search for and reuse existing code snippets during software development. Code search aims to retrieve relevant code snippets from a codebase in response to natural language queries entered by the developer. To date, researchers have proposed information retrieval (IR) based methods and deep learning (DL) based methods. IR-based methods focus on lexical matching, that is, ranking code snippets by the relevance between queries and code snippets, while DL-based methods focus on capturing semantic correlations. However, existing methods rarely capture the two matching signals simultaneously. Therefore, in this paper, we propose CSRS, a code search model with relevance matching and semantic matching. CSRS comprises (1) an embedding module containing convolution kernels of different sizes, which extract n-gram embeddings of queries and code snippets, (2) a relevance matching module that measures lexical matching signals, and (3) a co-attention based semantic matching module that captures semantic correlation. We train and evaluate CSRS on datasets of 18.22M and 10k code snippets, respectively. The experimental results show that CSRS achieves an MRR of 0.614, outperforming two state-of-the-art models, DeepCS and CARLCS-CNN, by 33.77% and 18.53% respectively. In addition, we conducted several experiments to demonstrate the effectiveness of each component of CSRS.
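For readers unfamiliar with the MRR (mean reciprocal rank) figure quoted in the abstract, the following is a minimal sketch of how the metric is commonly computed for code search. It is not taken from the paper: the function name `mean_reciprocal_rank`, the convention of passing `None` for a query whose correct snippet was not retrieved, and the example ranks are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(ranks: Sequence[Optional[int]]) -> float:
    """Average of 1/rank over all queries.

    ranks[i] is the 1-based position of the first correct code snippet
    in the result list for query i, or None if it was not retrieved
    (such a query contributes 0 to the sum). Hypothetical helper, not
    the authors' evaluation code.
    """
    if not ranks:
        raise ValueError("at least one query is required")
    total = 0.0
    for rank in ranks:
        if rank is None:
            continue  # a miss adds nothing to the sum
        if rank < 1:
            raise ValueError("ranks are 1-based and must be >= 1")
        total += 1.0 / rank
    return total / len(ranks)


if __name__ == "__main__":
    # Four made-up queries: hits at ranks 1, 2 and 4, plus one miss.
    example_ranks = [1, 2, None, 4]
    print(mean_reciprocal_rank(example_ranks))  # (1 + 0.5 + 0 + 0.25) / 4 = 0.4375
```

Under this definition, an MRR closer to 1 means the correct snippet tends to appear nearer the top of the result list.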

Yi Cheng

Central South University

Li Kuang

Central South University

Session Program

Tue 17 May
Displayed time zone: Eastern Time (US & Canada)

03:00 - 03:40	Session 12: Search and Reuse: Code
Tracks: Research / Early Research Achievements (ERA) / Replications and Negative Results (RENE)
At ICPC room. Chair(s): Fuxiang Chen (University of British Columbia)

03:00	7m Talk	CSRS: Code Search with Relevance Matching and Semantic Matching (Research)
	Yi Cheng (Central South University); Li Kuang (Central South University)
03:07	4m Talk	Clone-based code method usage pattern mining (Early Research Achievements (ERA))
	Zhipeng Xue (National University of Defense Technology)
03:11	7m Talk	Towards Exploring the Code Reuse from Stack Overflow during Software Development (Research)
	Yuan Huang (School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China); Furen Xu (School of Software Engineering, Sun Yat-sen University, Zhuhai 519082, China); Haojie Zhou (School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China); Xiangping Chen (Guangdong Key Laboratory for Big Data Analysis and Simulation of Public Opinion, School of Communication and Design, Sun Yat-sen University, Guangzhou 510006, China); Xiaocong Zhou (School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China); Tong Wang (School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China)
03:18	4m Talk	The Ineffectiveness of Domain-Specific Word Embedding Models for GUI Test Reuse (Replications and Negative Results (RENE))
	Farideh Sadat Khalili (Sharif University of Technology); Ali Mohebbi (USI Lugano); Valerio Terragni (University of Auckland); Mauro Pezze (USI Lugano; Schaffhausen Institute of Technology); Leonardo Mariani (University of Milano-Bicocca); Abbas Heydarnoori (Sharif University of Technology)
03:22	18m Live Q&A	Q&A-Paper Session 12 (Research)

Information for Participants

Tue 17 May 2022 03:00 - 03:40 at ICPC room - Session 12: Search and Reuse: Code. Chair(s): Fuxiang Chen
