HedgeCode: A Multi-Task Hedging Contrastive Learning Framework for Code Search
This program is tentative and subject to change.
Code search is a vital activity in software engineering, focused on identifying and retrieving the correct code snippets based on a query provided in natural language. Approaches based on deep learning techniques have been increasingly adopted for this task, enhancing the initial representations of both code and its natural language descriptions. Despite this progress, there remains an unexplored gap in ensuring consistency between the representation spaces of code and its descriptions. Furthermore, existing methods have not fully leveraged the potential relevance between code snippets and their descriptions, presenting a challenge in discerning fine-grained semantic distinctions among similar code snippets. To address these challenges, we introduce a multi-task hedging contrastive Learning framework for Code Search, referred to as HedgeCode. HedgeCode is structured around two primary training phases. The first phase, known as the representation alignment stage, proposes a hedging contrastive learning approach. This method aims to detect subtle differences between code and natural language text, thereby aligning their representation spaces by identifying relevance. The subsequent phase involves multi-task joint learning, wherein the previously trained model serves as the encoder. This stage optimizes the model through a combination of supervised and self-supervised contrastive learning tasks. Our framework’s effectiveness is demonstrated through its performance on the CodeSearchNet benchmark, showcasing HedgeCode’s ability to address the mentioned limitations in code search tasks.
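The abstract does not specify the hedging objective, so the following is only a minimal sketch of a generic in-batch contrastive (InfoNCE-style) loss of the kind commonly used to align query and code embeddings. It is not HedgeCode's actual loss. The function names, the toy embeddings, and the temperature value of 0.07 are all assumptions made for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine_similarity_matrix(queries, codes):
    """Return sims[i][j] = cosine similarity between query i and code j."""
    q = [l2_normalize(v) for v in queries]
    c = [l2_normalize(v) for v in codes]
    return [[sum(a * b for a, b in zip(qi, cj)) for cj in c] for qi in q]


def info_nce_loss(queries, codes, temperature=0.07):
    """Generic in-batch contrastive loss (illustrative, not the paper's loss).

    Query i is assumed to be paired with code i; every other code in the
    batch acts as a negative. The temperature is a hypothetical default.
    """
    sims = cosine_similarity_matrix(queries, codes)
    total = 0.0
    for i, row in enumerate(sims):
        logits = [s / temperature for s in row]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / len(sims)


def rank_codes(query, codes):
    """Return code indices sorted from most to least similar to the query."""
    sims = cosine_similarity_matrix([query], codes)[0]
    return sorted(range(len(codes)), key=lambda j: sims[j], reverse=True)


# Toy 2-D embeddings standing in for encoder outputs (assumed values).
queries = [[1.0, 0.0], [0.0, 1.0]]
codes_aligned = [[1.0, 0.0], [0.0, 1.0]]
codes_swapped = [[0.0, 1.0], [1.0, 0.0]]

aligned_loss = info_nce_loss(queries, codes_aligned)
swapped_loss = info_nce_loss(queries, codes_swapped)
ranking = rank_codes(queries[0], codes_aligned)
```

In this toy setup the loss is small when each query embedding sits next to its paired code embedding and large when the pairs are swapped, which is the pressure that pulls the two representation spaces together during training. At search time, ranking candidates by the same similarity score retrieves the paired snippet first.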
Thu 1 May
Displayed time zone: Eastern Time (US & Canada)
14:00 - 15:30
14:00 | 15m | Talk | Code Comment Inconsistency Detection and Rectification Using a Large Language Model | Research Track | Guoping Rong Nanjing University, Yongda Yu Nanjing University, Song Liu Nanjing University, Xin Tan Nanjing University, Tianyi Zhang Nanjing University, Haifeng Shen Southern Cross University, Jidong Hu Zhongxing Telecom Equipment
14:15 | 15m | Talk | Context Conquers Parameters: Outperforming Proprietary LLM in Commit Message Generation | Research Track | Aaron Imani University of California, Irvine, Iftekhar Ahmed University of California at Irvine, Mohammad Moshirpour University of California, Irvine
14:30 | 15m | Talk | HedgeCode: A Multi-Task Hedging Contrastive Learning Framework for Code Search | Research Track | Gong Chen Wuhan University, Xiaoyuan Xie Wuhan University, Xunzhu Tang University of Luxembourg, Qi Xin Wuhan University, Wenjie Liu Wuhan University
14:45 | 15m | Talk | Reasoning Runtime Behavior of a Program with LLM: How Far Are We? | Research Track | Junkai Chen Zhejiang University, Zhiyuan Pan Zhejiang University, Xing Hu Zhejiang University, Zhenhao Li York University, Ge Li Peking University, Xin Xia Huawei
15:00 | 15m | Talk | Source Code Summarization in the Era of Large Language Models | Research Track | Weisong Sun Nanjing University, Yun Miao Nanjing University, Yuekang Li UNSW, Hongyu Zhang Chongqing University, Chunrong Fang Nanjing University, Yi Liu Nanyang Technological University, Gelei Deng Nanyang Technological University, Yang Liu Nanyang Technological University, Zhenyu Chen Nanjing University
15:15 | 15m | Talk | Template-Guided Program Repair in the Era of Large Language Models | Research Track | Kai Huang, Jian Zhang Nanyang Technological University, Xiangxin Meng Beihang University, Beijing, China, Yang Liu Nanyang Technological University