SEMANTIC CODE FINDER: An Efficient Semantic Search Framework for Large-Scale Codebases
This program is tentative and subject to change.
We present SEMANTIC CODE FINDER, a framework for semantic code search that delivers strong search performance and supports multiple programming languages. It uses code summaries to extract the core content of each method and uses this information to answer search queries, which makes the search semantically meaningful. Evaluated on large-scale codebases, SEMANTIC CODE FINDER outperforms existing open-source code search tools, achieving higher recall and precision. Notably, it outperforms CodeMatcher, a previously successful semantic code search tool, by approximately 41% in terms of MRR. Its performance is consistent across Java, Python, and C++, highlighting its robustness and effectiveness. It is currently used as a code search service for a significant amount of source code within Samsung Electronics, meeting the needs of its developers.
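For readers unfamiliar with the MRR figure quoted above, the following is a minimal sketch of the standard mean reciprocal rank computation. It is not the authors' evaluation code: the function names and the toy data are hypothetical, and the abstract does not describe the exact evaluation protocol.

```python
from typing import Hashable, Sequence, Tuple


def reciprocal_rank(ranked: Sequence[Hashable], relevant: Hashable) -> float:
    """Return 1/rank of the relevant item in a ranked list, or 0.0 if absent."""
    for position, item in enumerate(ranked, start=1):
        if item == relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(
    runs: Sequence[Tuple[Sequence[Hashable], Hashable]]
) -> float:
    """Average the reciprocal rank over all (ranked results, relevant item) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ranked, rel) for ranked, rel in runs) / len(runs)


# Hypothetical toy data: three queries, each with a ranked result list
# and the single method considered relevant for that query.
example_runs = [
    (["parse_json", "read_file", "open_socket"], "parse_json"),  # rank 1
    (["read_file", "parse_json", "open_socket"], "parse_json"),  # rank 2
    (["read_file", "open_socket", "close_file"], "parse_json"),  # not found
]

mrr = mean_reciprocal_rank(example_runs)
print(f"MRR = {mrr:.3f}")
```

A higher MRR means the first relevant result tends to appear nearer the top of the ranked list; a query whose relevant result is never returned contributes zero.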
Thu 1 May. Displayed time zone: Eastern Time (US & Canada)
11:00 - 12:30
11:00 15m Talk | COCA: Generative Root Cause Analysis for Distributed Systems with Code Knowledge | Research Track | Yichen LI The Chinese University of Hong Kong, Yulun Wu The Chinese University of Hong Kong, Jinyang Liu Chinese University of Hong Kong, Zhihan Jiang The Chinese University of Hong Kong, Zhuangbin Chen Sun Yat-sen University, Guangba Yu Sun Yat-sen University, Michael Lyu The Chinese University of Hong Kong
11:15 15m Talk | Enhancing Code Generation via Bidirectional Comment-Level Mutual Grounding | Research Track
11:30 15m Talk | HumanEvo: An Evolution-aware Benchmark for More Realistic Evaluation of Repository-level Code Generation | Research Track | Dewu Zheng Sun Yat-sen University, Yanlin Wang Sun Yat-sen University, Ensheng Shi Xi’an Jiaotong University, Ruikai Zhang Huawei Cloud Computing Technologies, Yuchi Ma Huawei Cloud Computing Technologies, Hongyu Zhang Chongqing University, Zibin Zheng Sun Yat-sen University
11:45 15m Talk | SEMANTIC CODE FINDER: An Efficient Semantic Search Framework for Large-Scale Codebases | SE In Practice (SEIP) | daeha ryu Innovation Center, Samsung Electronics, Seokjun Ko Samsung Electronics Co., Eunbi Jang Innovation Center, Samsung Electronics, jinyoung park Innovation Center, Samsung Electronics, myunggwan kim Innovation Center, Samsung Electronics, changseo park Innovation Center, Samsung Electronics
12:00 15m Talk | Time to Retrain? Detecting Concept Drifts in Machine Learning Systems | SE In Practice (SEIP) | Tri Minh-Triet Pham Concordia University, Karthikeyan Premkumar Ericsson, Mohamed Naili Ericsson, Jinqiu Yang Concordia University
12:15 15m Talk | UML Sequence Diagram Generation: A Multi-Model, Multi-Domain Evaluation | SE In Practice (SEIP)