Learning Heterogeneous Abstract Code Graph Representations For Program Comprehension (APSEC 2024 - Technical Track)

Who

Shenning Song, Mengxi Zhang, Shaoquan Li, Huaxiao Liu

Track

APSEC 2024 Technical Track

Time Zone

The program is currently displayed in (GMT+08:00) Beijing, Chongqing, Hong Kong, Urumqi.

When

Thu 5 Dec 2024 14:30 - 15:00 at Room 3 (Xiangquan Ballroom) - Session (10) Chair(s): In-Young Ko

Abstract

Program comprehension is a fundamental activity in software engineering. However, understanding source code efficiently and accurately is challenging, because source code with similar semantics can differ in syntax. Recent state-of-the-art research has demonstrated that combining deep learning techniques with structural information from source code, specifically AST-based static graphs, can enhance the extraction of essential features from source programs. Control-flow and data-flow information in source programs can express richer semantics, yet existing studies often overlook their heterogeneous integration when constructing program static graphs. This oversight loses information about the types of static graph edges, potentially impeding program comprehension.

In this paper, we model the source program as a heterogeneous static graph and then use a Relational Graph Convolutional Network (R-GCN) for feature extraction. Specifically, we present an innovative method for constructing a program static graph, termed the Heterogeneous Abstract Code Graph (HACG), and then employ R-GCN to generate representations based on HACG for code classification and code clone detection. We evaluate our method on two large source code datasets: CodeNet, introduced by IBM, and BigCloneBench. The experimental results demonstrate the superiority of our approach over existing methods, achieving a code classification accuracy of 97.38% and an average F1-score of 98.34% in code clone detection.
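To illustrate why edge types matter, here is a minimal sketch of one R-GCN propagation step over a graph with typed edges. It is not the authors' implementation: the edge type names, the node names, and the weight values are hypothetical, since the abstract does not give the HACG schema. The layer applies a separate weight matrix per edge type, averages the messages within each type, adds a self-connection, and applies ReLU.

```python
from collections import defaultdict

# Hypothetical edge types for illustration only; the abstract does not
# specify the actual HACG edge schema.
EDGE_TYPES = ("ast_child", "control_flow", "data_flow")


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rgcn_layer(features, edges, weights, self_weight):
    """One R-GCN propagation step.

    features:    {node: feature vector}
    edges:       list of (source, edge_type, target) triples
    weights:     {edge_type: weight matrix}
    self_weight: weight matrix for the self-connection term
    """
    # Group incoming neighbours by (target node, edge type).
    incoming = defaultdict(list)
    for src, etype, dst in edges:
        incoming[(dst, etype)].append(src)

    updated = {}
    for node, vec in features.items():
        total = matvec(self_weight, vec)  # self-connection
        for etype, matrix in weights.items():
            neighbours = incoming.get((node, etype), [])
            for src in neighbours:
                message = matvec(matrix, features[src])
                # Normalise by the neighbour count under this edge type.
                total = [t + m / len(neighbours) for t, m in zip(total, message)]
        updated[node] = [max(0.0, t) for t in total]  # ReLU
    return updated


# Toy graph: a declaration and a branch both point at one use site.
identity = [[1.0, 0.0], [0.0, 1.0]]
features = {"decl_x": [1.0, 0.0], "if_stmt": [0.0, 1.0], "use_x": [0.0, 0.0]}
edges = [
    ("decl_x", "data_flow", "use_x"),
    ("if_stmt", "control_flow", "use_x"),
]
weights = {
    "ast_child": identity,
    "control_flow": [[2.0, 0.0], [0.0, 2.0]],
    "data_flow": [[-1.0, 0.0], [0.0, -1.0]],
}
out = rgcn_layer(features, edges, weights, identity)
```

In this toy example the two edges into `use_x` are transformed by different matrices, so the data-flow and control-flow messages contribute differently. A homogeneous graph would apply one shared matrix to both and lose that distinction.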

Shenning Song

The College of Computer Science and Technology, Jilin University

China

Mengxi Zhang

The College of Computer Science and Technology, Jilin University

China

Shaoquan Li

The College of Computer Science and Technology, Jilin University

China

Huaxiao Liu

The College of Computer Science and Technology, Jilin University

China

Session Program

Thu 5 Dec
Displayed time zone: Beijing, Chongqing, Hong Kong, Urumqi

14:00 - 15:30	Session (10), Technical Track / SEIP - Software Engineering in Practice, at Room 3 (Xiangquan Ballroom). Chair(s): In-Young Ko (Korea Advanced Institute of Science and Technology)

14:00 30m Talk	Why not Just Look For Answers? Using A More Direct Way for API Recommendation. Technical Track. Changxin Liu (Chongqing University), Ling Xu (School of Big Data & Software Engineering, Chongqing University), Wenhan Mu (Chongqing University), Rui Qin (Chongqing University)
14:30 30m Talk	Learning Heterogeneous Abstract Code Graph Representations For Program Comprehension. Technical Track. Shenning Song, Mengxi Zhang, Shaoquan Li, Huaxiao Liu (all: The College of Computer Science and Technology, Jilin University)
15:00 20m Talk	CoSTV: Accelerating Code Search with Two-Stage Paradigm and Vector Retrieval. SEIP - Software Engineering in Practice. Dewu Zheng, Yanlin Wang, Wenqing Chen, Jiachi Chen, Zibin Zheng (all: Sun Yat-sen University)