Improving Code Search with Multi-Modal Momentum Contrastive Learning
Contrastive learning has recently been applied to enhance BERT-based pre-trained models for code search. However, the existing end-to-end training mechanism cannot fully exploit the pre-trained models because of limits on the number and variety of negative samples. In this paper, we propose MoCoCS, a multi-modal momentum contrastive learning method for code search, which improves the representations of queries and code by constructing large-scale multi-modal negative samples. MoCoCS increases the number and variety of negative samples through two optimizations: integrating multi-batch negative samples and constructing multi-modal negative samples. We first build momentum contrasts for queries and code, which enables constructing sets of negative samples that extend beyond a single mini-batch. Then, to incorporate multi-modal code information, we build multi-modal momentum contrasts by encoding the abstract syntax tree and the data flow graph with a momentum encoder. Experiments on CodeSearchNet with six programming languages demonstrate that our method further improves the effectiveness of pre-trained models for code search.
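For readers unfamiliar with momentum contrast, the sketch below illustrates the general MoCo-style mechanism the abstract refers to: a momentum-updated key encoder plus a cross-batch queue of negative code representations, trained with an InfoNCE loss. This is a minimal PyTorch illustration under stated assumptions, not the authors' MoCoCS implementation; the class name, encoder setup, and hyperparameters (queue size, momentum, temperature) are illustrative values from the MoCo literature. The paper's multi-modal variant would add analogous momentum encoders and queues for the AST and data flow graph representations.

# Minimal sketch of MoCo-style momentum contrast for query-code matching (PyTorch).
# Hypothetical names and hyperparameters; not the authors' MoCoCS code.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F


class MomentumContrast(nn.Module):
    def __init__(self, query_encoder: nn.Module, dim: int = 128,
                 queue_size: int = 4096, momentum: float = 0.999,
                 temperature: float = 0.07):
        super().__init__()
        self.m = momentum
        self.t = temperature
        self.encoder_q = query_encoder                 # encodes natural-language queries
        # Simplification: the key (code) encoder is a momentum copy of the query encoder.
        self.encoder_k = copy.deepcopy(query_encoder)
        for p in self.encoder_k.parameters():
            p.requires_grad = False                    # updated only by EMA, never by backprop
        # Queue of negative code representations accumulated across mini-batches.
        self.register_buffer("queue", F.normalize(torch.randn(dim, queue_size), dim=0))
        self.register_buffer("ptr", torch.zeros(1, dtype=torch.long))

    @torch.no_grad()
    def _momentum_update(self):
        # EMA update: the key encoder slowly tracks the query encoder.
        for pq, pk in zip(self.encoder_q.parameters(), self.encoder_k.parameters()):
            pk.data.mul_(self.m).add_(pq.data, alpha=1.0 - self.m)

    @torch.no_grad()
    def _enqueue(self, keys: torch.Tensor):
        # Replace the oldest queue entries with the newest keys.
        # Assumes queue_size is a multiple of the batch size, as in MoCo.
        bsz = keys.shape[0]
        ptr = int(self.ptr)
        self.queue[:, ptr:ptr + bsz] = keys.T
        self.ptr[0] = (ptr + bsz) % self.queue.shape[1]

    def forward(self, query_inputs: torch.Tensor, code_inputs: torch.Tensor) -> torch.Tensor:
        q = F.normalize(self.encoder_q(query_inputs), dim=1)        # (B, dim)
        with torch.no_grad():
            self._momentum_update()
            k = F.normalize(self.encoder_k(code_inputs), dim=1)     # (B, dim), no gradient
        # InfoNCE: one positive code snippet per query, negatives drawn from the queue.
        l_pos = torch.einsum("nc,nc->n", q, k).unsqueeze(-1)                 # (B, 1)
        l_neg = torch.einsum("nc,ck->nk", q, self.queue.clone().detach())    # (B, K)
        logits = torch.cat([l_pos, l_neg], dim=1) / self.t
        labels = torch.zeros(logits.shape[0], dtype=torch.long, device=logits.device)
        self._enqueue(k)
        return F.cross_entropy(logits, labels)

Because negatives come from the queue rather than only from the current mini-batch, the number of negative samples is decoupled from the batch size, which is the key property the abstract relies on when it describes integrating multi-batch negative samples.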
Tue 16 May (times shown in the Hobart time zone)
13:45 - 15:15 | Programming Languages, Types, and Complexity
Discussion / Research / Replications and Negative Results (RENE) / Journal First
Meeting Room 106 | Chair(s): Vittoria Nardone

13:45 | 9m | Full-paper | How Well Static Type Checkers Work with Gradual Typing? A Case Study on Python | Research
Wenjie Xu (Nanjing University), Lin Chen (Nanjing University), Chenghao Su (Nanjing University), Yimeng Guo (Nanjing University), Yanhui Li (Nanjing University), Yuming Zhou (Nanjing University), Baowen Xu (Nanjing University)

13:54 | 9m | Full-paper | Too Simple? Notions of Task Complexity used in Maintenance-based Studies of Programming Tools | Research
Patrick Rein (University of Potsdam; Hasso Plattner Institute), Tom Beckmann (Hasso Plattner Institute), Eva Krebs (Hasso Plattner Institute (HPI), University of Potsdam, Germany), Toni Mattis (University of Potsdam; Hasso Plattner Institute), Robert Hirschfeld (University of Potsdam; Hasso Plattner Institute)

14:03 | 9m | Full-paper | Path Complexity Predicts Code Comprehension Effort | Research
Sofiane Dissem (Harvey Mudd College), Eli Pregerson (Harvey Mudd College), Adi Bhargava (Harvey Mudd College), Josh Cordova (Harvey Mudd College), Lucas Bang (Harvey Mudd College)

14:12 | 5m | Short-paper | Revisiting Deep Learning for Variable Type Recovery | Replications and Negative Results (RENE) | Pre-print

14:17 | 9m | Talk | Programming language implementations for context-oriented self-adaptive systems | Journal First
Nicolás Cardozo (Universidad de los Andes), Kim Mens (Université catholique de Louvain, ICTEAM institute, Belgium)
Link to publication | DOI | Media Attached

14:26 | 9m | Full-paper | Improving Code Search with Multi-Modal Momentum Contrastive Learning | Research | Pre-print
Zejian Shi (Fudan University), Yun Xiong (Fudan University), Yao Zhang (Fudan University), Zhijie Jiang (National University of Defense Technology), Jinjing Zhao (National Key Laboratory of Science and Technology on Information System Security), Lei Wang (National University of Defense Technology), Shanshan Li (National University of Defense Technology)

14:35 | 9m | Full-paper | Revisiting Lightweight Compiler Provenance Recovery on ARM Binaries | Replications and Negative Results (RENE) | Pre-print

14:44 | 31m | Panel | Discussion 7 | Discussion