ICSE 2023
Sun 14 - Sat 20 May 2023 Melbourne, Australia
Wed 17 May 2023 11:52 - 12:00 at Level G - Plenary Room 1 - AI models for SE Chair(s): Denys Poshyvanyk

Code embeddings have seen increasing applications in software engineering (SE) research and practice recently. Despite the advances in embedding techniques applied in SE research, one of the main challenges is their generalizability. A recent study finds that code embeddings may not be readily leveraged for downstream tasks that the embeddings are not specifically trained for. Therefore, in this paper, we propose GraphCodeVec, which represents source code as graphs and leverages Graph Convolutional Networks to learn more generalizable code embeddings in a task-agnostic manner. The edges in the graph representation are automatically constructed from the paths in the abstract syntax trees, and the nodes from the tokens in the source code. To evaluate the effectiveness of GraphCodeVec, we consider three downstream benchmark tasks (i.e., code comment generation, code authorship identification, and code clone detection) that were used in a prior benchmark of code embeddings and add three new downstream tasks (i.e., source code classification, logging statement prediction, and software defect prediction), for a total of six downstream tasks in our evaluation. For each downstream task, we apply the embeddings learned by GraphCodeVec and the embeddings learned by four baseline approaches and compare their respective performance. We find that GraphCodeVec outperforms all the baselines in five out of the six downstream tasks and that its performance is relatively stable across different tasks and datasets. In addition, we perform ablation experiments to understand the impact of the training context (i.e., the graph context extracted from the abstract syntax trees) and the training model (i.e., the Graph Convolutional Networks) on the effectiveness of the generated embeddings. The results show that both the graph context and the Graph Convolutional Networks help GraphCodeVec produce high-quality embeddings for the downstream tasks, while the improvement from the Graph Convolutional Networks is more robust across different downstream tasks and datasets. Our findings suggest that future research and practice may consider using graph-based deep learning methods to capture the structural information of source code for SE tasks.
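To make the graph-based idea in the abstract more concrete, here is a minimal sketch of one graph convolution step over a small token graph. It is not the GraphCodeVec implementation. It assumes the widely used propagation rule ReLU(D^-1/2 (A + I) D^-1/2 H W), and the token list, the edges, the one-hot features, and the weight values are all hypothetical placeholders chosen for illustration. In particular, the edges are written by hand here, whereas the abstract says they are constructed automatically from abstract syntax tree paths.

```python
import math


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so every node keeps part of its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]  # always >= 1 because of the self-loop
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(adj, features, weights):
    """One graph convolution: aggregate neighbours, project, apply ReLU."""
    a_norm = normalize_adjacency(adj)
    z = matmul(matmul(a_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in z]


# Hypothetical token graph for the statement `total = a + b`.
tokens = ["total", "=", "a", "+", "b"]
edges = [(0, 1), (1, 3), (3, 2), (3, 4)]  # assumed, hand-written edges

n = len(tokens)
adj = [[0.0] * n for _ in range(n)]
for i, j in edges:
    adj[i][j] = adj[j][i] = 1.0  # treat the graph as undirected

# One-hot input features and an arbitrary 5x2 weight matrix (untrained).
features = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
weights = [[0.1, -0.2], [0.3, 0.4], [-0.5, 0.6], [0.7, -0.8], [0.9, 1.0]]

embeddings = gcn_layer(adj, features, weights)
for token, vector in zip(tokens, embeddings):
    print(token, [round(v, 3) for v in vector])
```

Each token ends up with a two-dimensional vector that mixes its own features with those of its graph neighbours. In a trained model the weights would be learned rather than fixed, and several such layers would typically be stacked.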

Wed 17 May

Displayed time zone: Hobart

11:00 - 12:30
11:00
15m
Talk
One Adapter for All Programming Languages? Adapter Tuning for Multilingual Tasks in Software Engineering
Technical Track
Deze Wang National University of Defense Technology, Boxing Chen, Shanshan Li National University of Defense Technology, Wei Luo, Shaoliang Peng Hunan University, Wei Dong School of Computer, National University of Defense Technology, China, Liao Xiangke National University of Defense Technology
11:15
15m
Talk
CCRep: Learning Code Change Representations via Pre-Trained Code Model and Query Back
Technical Track
Zhongxin Liu Zhejiang University, Zhijie Tang Zhejiang University, Xin Xia Huawei, Xiaohu Yang Zhejiang University
Pre-print
11:30
15m
Talk
Keeping Pace with Ever-Increasing Data: Towards Continual Learning of Code Intelligence Models
Technical Track
Shuzheng Gao Harbin Institute of Technology, Hongyu Zhang The University of Newcastle, Cuiyun Gao Harbin Institute of Technology, Chaozheng Wang Harbin Institute of Technology
11:45
7m
Talk
PCR-Chain: Partial Code Reuse Assisted by Hierarchical Chaining of Prompts on Frozen Copilot
DEMO - Demonstrations
Qing Huang School of Computer Information Engineering, Jiangxi Normal University, Jiahui Zhu School of Computer Information Engineering, Jiangxi Normal University, Zhilong Li School of Computer Information Engineering, Jiangxi Normal University, Zhenchang Xing, Changjing Wang School of Computer Information Engineering, Jiangxi Normal University, Xiwei (Sherry) Xu CSIRO’s Data61
11:52
7m
Talk
Towards Learning Generalizable Code Embeddings using Task-agnostic Graph Convolutional Networks
Journal-First Papers
Zishuo Ding Concordia University, Heng Li Polytechnique Montréal, Weiyi Shang University of Waterloo, Tse-Hsun (Peter) Chen Concordia University
12:00
7m
Talk
deGraphCS: Embedding Variable-based Flow Graph for Neural Code Search
Journal-First Papers
Chen Zeng National University of Defense Technology, Yue Yu College of Computer, National University of Defense Technology, Changsha 410073, China, Shanshan Li National University of Defense Technology, Xin Xia Huawei, Wang Zhiming National University of Defense Technology, Mingyang Geng National University of Defense Technology, Linxiao Bai National University of Defense Technology, Wei Dong School of Computer, National University of Defense Technology, China, Liao Xiangke National University of Defense Technology
12:07
7m
Talk
CodeS: Towards Code Model Generalization Under Distribution Shift
NIER - New Ideas and Emerging Results
Qiang Hu University of Luxembourg, Yuejun Guo University of Luxembourg, Xiaofei Xie Singapore Management University, Maxime Cordy University of Luxembourg, Luxembourg, Lei Ma University of Alberta, Mike Papadakis University of Luxembourg, Luxembourg, Yves Le Traon University of Luxembourg, Luxembourg
12:15
7m
Talk
Towards using Few-Shot Prompt Learning for Automating Model Completion
NIER - New Ideas and Emerging Results
Meriem Ben Chaaben Université de Montréal, DIRO, Lola Burgueño University of Malaga, Houari Sahraoui Université de Montréal