FSE 2025
Mon 23 - Fri 27 June 2025 Trondheim, Norway
co-located with ISSTA 2025
Mon 23 Jun 2025 14:40 - 15:00 at Aurora A - Code Search Chair(s): Xin Xia

Code search is a crucial task in software engineering, aiming to retrieve code snippets that are semantically relevant to a natural language query. Recently, Pre-trained Language Models (PLMs) have shown remarkable success and are widely adopted for code search tasks. However, PLM-based methods often struggle in cross-domain scenarios. When applied to a new domain, they typically require extensive fine-tuning with substantial data. Even worse, the data scarcity problem in new domains often forces these methods to operate in a zero-shot setting, resulting in a significant decline in performance. RAPID, which generates synthetic data for model fine-tuning, is currently the only effective method for zero-shot cross-domain code search. Despite its effectiveness, RAPID demands substantial computational resources for fine-tuning and needs to maintain specialized models for each domain, underscoring the need for a zero-shot, fine-tuning-free approach for cross-domain code search.

The key to tackling zero-shot cross-domain code search lies in bridging the gaps among domains. In this work, we propose to break the query-code matching process of code search into two simpler tasks: query-comment matching and code-code matching. We first conduct an empirical study to investigate the effectiveness of these two matching schemas in zero-shot cross-domain code search. Our findings highlight the strong complementarity among the three matching schemas, i.e., query-code, query-comment, and code-code matching. Based on the findings, we propose CodeBridge, a zero-shot, fine-tuning-free approach for cross-domain code search. Specifically, CodeBridge first employs zero-shot prompting to guide Large Language Models (LLMs) to generate a comment for each code snippet in the codebase and a code snippet for each query. Subsequently, it encodes queries, code snippets, comments, and the generated code using PLMs and assesses similarities through three matching schemas: query-code, query-comment, and generated code-code. Lastly, CodeBridge leverages a sampling-based fusion approach that combines these three similarity scores to rank the final search outcomes. Experimental results show that our approach outperforms the state-of-the-art PLM-based code search approaches, i.e., CoCoSoDa and UniXcoder, by an average of 21.4% and 24.9% in MRR, respectively, across three datasets. Our approach also yields results that are better than or comparable to those of the zero-shot cross-domain code search approach RAPID, which requires fine-tuning.
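The idea of combining the three matching schemas can be illustrated with a minimal sketch. It is not the CodeBridge implementation: the abstract does not describe the sampling-based fusion, so a plain weighted sum stands in for it here, and the function names, the equal weights, and the toy two-dimensional vectors (in place of PLM embeddings) are all assumptions made for illustration. The last helper shows the per-query reciprocal rank that MRR averages over queries.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def fuse_and_rank(query_vec, gen_code_vec, code_vecs, comment_vecs,
                  weights=(1 / 3, 1 / 3, 1 / 3)):
    """Rank code snippets by a weighted sum of three similarity scores.

    Assumption: a weighted sum replaces the paper's sampling-based fusion,
    whose details are not given in the abstract. The weights are hypothetical.
    """
    w_query_code, w_query_comment, w_code_code = weights
    scored = []
    for idx, (code_vec, comment_vec) in enumerate(zip(code_vecs, comment_vecs)):
        score = (
            w_query_code * cosine(query_vec, code_vec)          # query-code
            + w_query_comment * cosine(query_vec, comment_vec)  # query-comment
            + w_code_code * cosine(gen_code_vec, code_vec)      # generated code-code
        )
        scored.append((score, idx))
    # Highest fused score first; ties broken by snippet index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored]


def reciprocal_rank(ranking, relevant_idx):
    """Reciprocal rank of the relevant snippet; MRR is the mean of this over queries."""
    return 1.0 / (ranking.index(relevant_idx) + 1)


# Toy stand-ins for embeddings of one query, its LLM-generated code,
# three code snippets, and their LLM-generated comments.
query_vec = [1.0, 0.0]
gen_code_vec = [0.0, 1.0]
code_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
comment_vecs = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]

ranking = fuse_and_rank(query_vec, gen_code_vec, code_vecs, comment_vecs)
```

In this toy setup, snippet 0 matches the query only through query-code similarity, while snippet 1 matches through the other two schemas, so the fused ranking can differ from what any single schema would produce.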

Mon 23 Jun

Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

14:00 - 15:30
Code Search
Research Papers / Journal First / Ideas, Visions and Reflections at Aurora A
Chair(s): Xin Xia Zhejiang University
14:00
20m
Talk
10 years later: revisiting how developers search for code
Research Papers
Kathryn Stolee North Carolina State University, Tobias Welp Google, Caitlin Sadowski, Sebastian Elbaum University of Virginia
14:20
20m
Talk
Approaching Code Search for Python as a Translation Retrieval Problem with Dual Encoders
Journal First
Monoshiz Mahbub Khan Rochester Institute of Technology, Zhe Yu Rochester Institute of Technology
14:40
20m
Talk
Zero-Shot Cross-Domain Code Search without Fine-Tuning
Research Papers
Keyu Liang Zhejiang University, Zhongxin Liu Zhejiang University, Chao Liu Chongqing University, Zhiyuan Wan Zhejiang University, David Lo Singapore Management University, Xiaohu Yang Zhejiang University
15:00
10m
Talk
Measuring What Matters: An Aggregate Metric for Assessing Enterprise Code Summaries
Ideas, Visions and Reflections
Ashita Saxena IBM Research, Palanivel Kodeswaran IBM Research India, Sayandeep Sen IBM Research India, Srikanth Tamilselvam IBM Research
15:10
20m
Talk
MiSum: Multi-Modality Heterogeneous Code Graph Learning for Multi-Intent Binary Code Summarization
Research Papers
Kangchen Zhu National University of Defense Technology, Zhiliang Tian National University of Defense Technology, Shangwen Wang National University of Defense Technology, Weiguo Chen National University of Defense Technology, Zixuan Dong National University of Defense Technology, mingyue leng National University of Defense Technology, Xiaoguang Mao National University of Defense Technology

Information for Participants
Info for room Aurora A:

Aurora A is the first room in the Aurora wing.

When facing the main Cosmos Hall, access to the Aurora wing is on the right, close to the side entrance of the hotel.
