Code Retrieval with Mixture of Experts Prototype Learning Based on Classification
The semantic connection between code and queries is crucial for code retrieval, but many human-written queries fail to capture the code's core intent accurately, leading to ambiguity. This ambiguity complicates code search, because the queries do not give a clear overview of the code's purpose. Our analysis reveals that although ambiguous queries may not precisely summarize the intent of the code, they often share the same general topics as the corresponding code. In light of this finding, we propose Code Retrieval with Mixture of Experts Prototype Learning Based on Classification (CRME), a novel approach that combines classification for prototype-based representation learning with result ensembling. CRME utilizes specialized pre-trained models focused on the specific domains of ambiguous queries. It consists of two key components: Multiple Classification Prototype and Representation Learning with a Prototype-based Multi-model Contrastive (PMC) Loss during training, and a Multi-Prototype Mixture of Experts Integration (MP-MoE) module for fine-grained ensemble inference. Our method effectively addresses the issue of query ambiguity and improves search precision. Experimental results on the CodeSearchNet dataset, covering six sub-datasets, show that CRME outperforms existing methods, achieving an average MRR score of 81.4%. When applied to pre-trained models such as CodeBERT, GraphCodeBERT, UniXcoder, and CodeT5+, CRME effectively boosts their performance.
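The abstract reports results as mean reciprocal rank (MRR). For readers unfamiliar with the metric, the following is a minimal sketch of how MRR is commonly computed for retrieval; it is not the authors' evaluation code. The function names `rank_of_target` and `mean_reciprocal_rank` and the toy scores are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def rank_of_target(scores: Sequence[float], target_index: int) -> int:
    """Return the 1-based rank of the correct candidate for one query.

    `scores` holds one similarity score per candidate code snippet
    (higher means more relevant); `target_index` marks the correct one.
    Ties are resolved in favour of the target (an assumption of this sketch).
    """
    target_score = scores[target_index]
    return 1 + sum(1 for s in scores if s > target_score)


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank over all queries; a rank of 0 means 'not retrieved'."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1.0 / r if r > 0 else 0.0 for r in ranks) / len(ranks)


# Toy example (hypothetical scores): three queries, three candidates each.
toy_scores = [
    ([0.9, 0.1, 0.3], 0),  # correct snippet ranked 1st
    ([0.2, 0.9, 0.5], 2),  # correct snippet ranked 2nd
    ([0.8, 0.7, 0.1], 2),  # correct snippet ranked 3rd
]
toy_ranks = [rank_of_target(s, t) for s, t in toy_scores]
toy_mrr = mean_reciprocal_rank(toy_ranks)
```

Under this definition, an MRR of 81.4% means that the average reciprocal rank of the correct snippet across queries is 0.814; a value of 1.0 would mean the correct snippet is always ranked first.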
Fri 20 Jun (displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna)
14:00 - 15:30 | Session 2: AI for Software Engineering I | Research Track at Cosmos 3A | Chair(s): Jialun Cao (Hong Kong University of Science and Technology)
14:00 | 15m | Talk | Code Retrieval with Mixture of Experts Prototype Learning Based on Classification | Research Track | Feng Ling (School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China), Guoheng Huang (School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China), Jingchao Wang (School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China), Xiaochen Yuan (Faculty of Applied Sciences, Macau Polytechnic University, Macau, China), Xuhang Chen (School of Computer Science and Engineering, Huizhou University, Huizhou 516001, China), XueYong Zhang (School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China), Fanlong Zhang (School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China), Chi-Man Pun (Department of Computer and Information Science, University of Macau, Macau, China)
14:15 | 15m | Talk | Issue Retrieval and Verification Enhanced Supplementary Code Comment Generation | Research Track | Yanzhen Zou (Peking University), Xianlin Zhao (Peking University), Xinglu Pan (Peking University), Bing Xie (Peking University)
14:30 | 15m | Talk | CodeCleaner: Mitigating Data Contamination for LLM Benchmarking | Research Track | Jialun Cao (Hong Kong University of Science and Technology), Songqiang Chen (The Hong Kong University of Science and Technology), Wuqi Zhang (MegaETH), Hau Ching Lo (The Hong Kong University of Science and Technology), Yeting Li (Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences), Shing-Chi Cheung (Hong Kong University of Science and Technology)
14:45 | 15m | Talk | LASER: Script Execution by Autonomous Agents for On-demand Traffic Simulation | Research Track | Hao Gao (Nanjing University), Jingyue Wang (Nanjing University), Wenyang Fang (Nanjing University), Jingwei Xu, Yunpeng Huang (Nanjing University), Taolue Chen (Birkbeck, University of London), Xiaoxing Ma (Nanjing University)
15:00 | 15m | Talk | Tech-ASan: Two-stage check for Address Sanitizer | Research Track | Yixuan Cao (ShenZhen University), Yuhong Feng (Shenzhen University), Huafeng Li (Shenzhen University), Chongyi Huang (Shenzhen University), Fangcao Jian (Shenzhen University), Haoran Li (Shenzhen University), Xu Wang (Shenzhen University)
Cosmos 3A is the first room in the Cosmos 3 wing.
When facing the main Cosmos Hall, access to the Cosmos 3 wing is on the left, close to the stairs. The area is accessed through a large door with the number “3”, which will stay open during the event.