ICSE 2024
Fri 12 - Sun 21 April 2024 Lisbon, Portugal
Wed 17 Apr 2024 11:00 - 11:15 at Amália Rodrigues - Evolution & AI Chair(s): Oscar Chaparro

Code clone detection (CCD) is of critical importance in software engineering, and semantic similarity is a key evaluation factor for CCD. Embedding techniques, which represent an object as a numerical vector, are used to generate code representations in which code snippets with similar semantics (clone pairs) should have similar vectors. However, due to the diversity and flexibility of high-level programming languages, the code representations of clone pairs may be inconsistent. Assembly code provides the program execution trace and can normalize the diversity of high-level languages in terms of program behavior semantics. After revisiting assembly language, we find that different assembly languages can align with the computational logic and memory access patterns of clone pairs. Therefore, using multiple assembly languages can capture behavior semantics and enhance the understanding of programs. We propose Prism, a new method for code clone detection that fuses behavior semantics from the assembly code of multiple architectures and directly captures syntactic and semantic information across multilingual domains. Additionally, we introduce a multi-feature fusion strategy that leverages global information interaction to expand the representation space. This fusion process allows us to capture the complementary information from each feature and leverage the relationships between them to create a more expressive representation of the code. Evaluated on the OJClone dataset, the Prism model achieved precision and recall scores of 0.999 and 0.999, respectively. Additionally, incorporating behavior semantics into the prior model improves its clone detection performance.
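To make the embedding idea in the abstract concrete, here is a minimal sketch of comparing two snippets by the similarity of their vectors. It is not Prism's implementation: the per-architecture feature vectors are made-up numbers, plain concatenation stands in for the paper's multi-feature fusion strategy, and the function names and the 0.9 threshold are assumptions chosen only for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def fuse(features):
    """Naive fusion by concatenation (a stand-in, not the paper's strategy)."""
    fused = []
    for vec in features:
        fused.extend(vec)
    return fused


def is_clone(features_a, features_b, threshold=0.9):
    """Flag a pair as clones when the fused vectors are similar enough."""
    return cosine_similarity(fuse(features_a), fuse(features_b)) >= threshold


# Hypothetical embeddings: one vector per target architecture for each snippet.
snippet_a = [[0.9, 0.1, 0.3], [0.8, 0.2, 0.4]]
snippet_b = [[0.85, 0.15, 0.35], [0.75, 0.25, 0.45]]  # behaves like snippet_a
snippet_c = [[0.1, 0.9, 0.0], [0.0, 0.7, 0.1]]        # unrelated behavior

print(is_clone(snippet_a, snippet_b))
print(is_clone(snippet_a, snippet_c))
```

In a real system the vectors would come from a trained encoder over each architecture's assembly code, and the threshold would be tuned on a labeled dataset such as OJClone.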

Wed 17 Apr

Displayed time zone: Lisbon

11:00 - 12:30
Evolution & AI - Research Track at Amália Rodrigues
Chair(s): Oscar Chaparro William & Mary
11:00
15m
Talk
Prism: Decomposing Program Semantics for Code Clone Detection through Compilation
Research Track
Haoran Li Nankai University, wangsiqian Nankai University, Weihong Quan Nankai University, Xiaoli Gong Nankai University, Huayou Su NUDT, Jin Zhang Hunan Normal University
11:15
15m
Talk
Evaluating Code Summarization Techniques: A New Metric and an Empirical Characterization
Research Track
Antonio Mastropaolo Università della Svizzera italiana, Matteo Ciniselli Università della Svizzera italiana, Massimiliano Di Penta University of Sannio, Italy, Gabriele Bavota Software Institute @ Università della Svizzera italiana
11:30
15m
Talk
Are Prompt Engineering and TODO Comments Friends or Foes? An Evaluation on GitHub Copilot
Research Track
David OBrien Iowa State University, Sumon Biswas Carnegie Mellon University, Sayem Mohammad Imtiaz Iowa State University, Rabe Abdalkareem Omar Al-Mukhtar University, Emad Shihab Concordia University, Hridesh Rajan Iowa State University
11:45
15m
Talk
Automatic Semantic Augmentation of Language Model Prompts (for Code Summarization)
Research Track
Toufique Ahmed University of California at Davis, Kunal Suresh Pai UC Davis, Prem Devanbu University of California at Davis, Earl T. Barr University College London
12:00
15m
Talk
DSFM: Enhancing Functional Code Clone Detection with Deep Subtree Interactions
Research Track
Zhiwei Xu Tsinghua University, Shaohua Qiang Tsinghua University, Dinghong Song Tsinghua University, Min Zhou Tsinghua University, Hai Wan Tsinghua University, Xibin Zhao Tsinghua University, Ping Luo Tsinghua University, Hongyu Zhang Chongqing University
12:15
15m
Talk
Machine Learning is All You Need: A Simple Token-based Approach for Effective Code Clone Detection
Research Track
Siyue Feng Huazhong University of Science and Technology, Wenqi Suo Huazhong University of Science and Technology, Yueming Wu Nanyang Technological University, Deqing Zou Huazhong University of Science and Technology, Yang Liu Nanyang Technological University, Hai Jin Huazhong University of Science and Technology