SICode: Embedding-Based Subgraph Isomorphism Identification for Bug Detection
ICPC Full paper
Given a known buggy code snippet, searching a target project for similar patterns is a reasonable way to detect unknown bugs. In practice, a search unit, such as a function, may look quite different from the buggy snippet yet contain a similar buggy substructure. Subgraph isomorphism identification can effectively hunt for such potential bugs by checking whether an approximate copy of the buggy subgraph exists within the target code graphs. Unfortunately, subgraph isomorphism identification is an NP-complete problem. In this paper, we propose an embedding-based method, SICode, to efficiently perform subgraph isomorphism identification for code graphs. We train a graph embedding model so that the subgraph isomorphism relationship between two graphs can be measured by comparing their embedding vectors. In this manner, we can efficiently identify potentially buggy code graphs via vector arithmetic without solving an NP-complete problem. A cascading loss scheme is presented to ensure identification performance. SICode scales better than classic subgraph isomorphism algorithms, such as VF2, while maintaining high precision and recall. Experiments also demonstrate that SICode offers advantages in detecting sub-structurally similar bugs. Our approach found dozens of previously unknown bugs in real-world projects, which have been confirmed by their developers. Among them, 18 bugs rank within the top ten retrieval results. This result is very encouraging for detecting subtle sub-structurally similar bugs.
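The abstract does not say how the embedding vectors are compared, so the following is only an illustrative sketch under a stated assumption: an order-embedding style criterion, in which a query graph is treated as a subgraph of a target graph when the query embedding does not exceed the target embedding in any dimension. The function names, the three-dimensional vectors, and the candidate function names are all hypothetical and do not come from SICode.

```python
from typing import Dict, List, Sequence, Tuple


def order_violation(query_vec: Sequence[float], target_vec: Sequence[float]) -> float:
    """Squared amount by which the query embedding exceeds the target embedding.

    Assumption (not from the paper): the model was trained so that
    "query is a subgraph of target" corresponds to
    query_vec[i] <= target_vec[i] in every dimension i.
    A violation of 0.0 means the assumed criterion is fully satisfied.
    """
    if len(query_vec) != len(target_vec):
        raise ValueError("embeddings must have the same dimension")
    return sum(max(0.0, q - t) ** 2 for q, t in zip(query_vec, target_vec))


def rank_candidates(
    query_vec: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Return the top_k candidates with the smallest violation (best first)."""
    scored = [(name, order_violation(query_vec, vec)) for name, vec in candidates.items()]
    scored.sort(key=lambda item: item[1])
    return scored[:top_k]


# Hypothetical embeddings: one buggy pattern and three target functions.
buggy_pattern = [1.0, 0.5, 0.0]
functions = {
    "parse_header": [2.0, 1.0, 0.5],  # at least as large as the pattern everywhere
    "free_buffer": [0.5, 0.5, 0.0],   # falls short in dimension 0
    "init_device": [0.0, 0.0, 3.0],   # falls short in dimensions 0 and 1
}

ranking = rank_candidates(buggy_pattern, functions, top_k=2)
for name, violation in ranking:
    print(f"{name}: violation={violation}")
```

The point of the sketch is the cost model: each comparison is a single pass over two vectors, so ranking every function in a project is linear in the number of functions, whereas an exact matcher such as VF2 has to search over node mappings for each pair of graphs.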
Tue 16 Apr. Displayed time zone: Lisbon
11:00 - 12:30 | Bugs, Defects, and Code Quality | Research Track / Early Research Achievements (ERA) / Replications and Negative Results (RENE) | at Sophia de Mello Breyner Andresen | Chair(s): Alberto Martin-Lopez (Software Institute - USI, Lugano)
11:00 (10m) Talk | What the Fix? A Study of ASAT Rules Documentation | ICPC Full paper, Research Track | Corentin Latappy (Univ. Bordeaux, CNRS, Bordeaux INP, LaBRI, UMR 5800, Promyze); Thomas Degueule (CNRS); Jean-Rémy Falleri (Bordeaux INP); Romain Robbes (CNRS, LaBRI, University of Bordeaux); Xavier Blanc (Univ. Bordeaux, Bordeaux INP, CNRS, LaBRI, UMR5800); Cédric Teyton (Promyze, Bordeaux, France)
11:10 (10m) Talk | SolaSim: Clone Detection for Solana Smart Contracts via Program Representation | ICPC Full paper, Research Track | Che Wang (Peking University, China); Yue Li (Peking University); Jianbo Gao (Peking University); Ke Wang (Peking University); Jiashuo Zhang (Peking University, China); Zhi Guan (Peking University); Zhong Chen
11:20 (10m) Talk | The Impact of Compiler Warnings on Code Quality in C++ Projects | ICPC Full paper, Research Track | Albin Johansson (Chalmers University of Technology); Carl Holmberg (Chalmers University of Technology); Francisco Gomes de Oliveira Neto (Chalmers | University of Gothenburg); Philipp Leitner (Chalmers | University of Gothenburg)
11:30 (10m) Talk | Vulnerabilities in AI Code Generators: Exploring Targeted Data Poisoning Attacks | ICPC Full paper, Research Track | Domenico Cotroneo (University of Naples Federico II); Cristina Improta (University of Naples Federico II); Pietro Liguori (University of Naples Federico II); Roberto Natella (Federico II University of Naples)
11:40 (10m) Talk | A Just-in-time Software Defect Localization Method based on Code Graph Representation | ICPC Full paper, Virtual-Talk, Research Track | Huan Zhang (Central South University); Wei-Huan Min (Central South University); Zhao Wei (Tencent); Li Kuang (School of Computer Science and Engineering, Central South University); Hong-Hao Gao (Shanghai University); Huai-Kou Miao (Shanghai University)
11:50 (10m) Talk | SICode: Embedding-Based Subgraph Isomorphism Identification for Bug Detection | ICPC Full paper, Research Track | Yuanjun Gong (Renmin University of China); Jianglei Nie (Renmin University of China); Wei You (Renmin University of China); Wenchang Shi (Renmin University of China, China); Jianjun Huang (Renmin University of China); Bin Liang (Renmin University of China, China); Jian Zhang (Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences)
12:00 (10m) Talk | Tuning Code Smell Prediction Models: A Replication Study | ICPC RENE Paper, Replications and Negative Results (RENE) | Henrique Gomes Nunes (Federal University of Minas Gerais (UFMG)); Amanda Santana (Federal University of Minas Gerais (UFMG)); Eduardo Figueiredo (Federal University of Minas Gerais, Brazil); Heitor Augustus Xavier Costa (Federal University of Lavras)
12:10 (8m) Talk | Studying Vulnerable Code Entities in R | ICPC ERA Paper, Early Research Achievements (ERA) | Zixiao Zhao (University of British Columbia); Millon Madhur Das (Indian Institute of Technology Kharagpur); Fatemeh Hendijani Fard (University of British Columbia)
12:18 (12m) Talk | Bugs, Defects, and Code Quality: Panel with Speakers | ICPC Discussion