ICSE 2022
Sun 8 - Fri 27 May 2022
Wed 11 May 2022 05:15 - 05:20 at ICSE room 2-odd hours - Program Comprehension 1 Chair(s): Prajish Prasad
Wed 11 May 2022 21:20 - 21:25 at ICSE room 1-odd hours - Program Comprehension 3 Chair(s): Christina von Flach

With the great success of pre-trained models, the pretrain-then-finetune paradigm has been widely adopted on downstream tasks for source code understanding. However, compared with the costly training of a large-scale model from scratch, how to adapt pre-trained models effectively to a new task has not been fully explored. In this paper, we propose an approach to bridge pre-trained models and code-related tasks. We exploit semantic-preserving transformations to enrich the diversity of downstream data and to help pre-trained models learn semantic features that are invariant to these semantically equivalent transformations. Further, we introduce curriculum learning to organize the transformed data in an easy-to-hard manner when fine-tuning existing pre-trained models.
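
As a rough illustration of the two ideas in the abstract above, the following Python sketch applies one simple semantic-preserving transformation (identifier renaming) and then orders the resulting variants from easy to hard. It is not the authors' implementation: the choice of renaming as the transformation, the helper names `rename_variables` and `build_curriculum`, and the use of "number of renamed identifiers" as the difficulty score are all assumptions made for this example. Renaming preserves behavior only when the new names do not collide with existing ones.

```python
import ast


class RenameVariables(ast.NodeTransformer):
    """Rename variables and parameters according to a fixed mapping."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):
        # Variable reads and writes.
        node.id = self.mapping.get(node.id, node.id)
        return node

    def visit_arg(self, node):
        # Function parameters.
        node.arg = self.mapping.get(node.arg, node.arg)
        self.generic_visit(node)
        return node


def rename_variables(source, mapping):
    """Return `source` with identifiers renamed (hypothetical helper).

    Assumes the new names do not clash with names already in use.
    """
    tree = RenameVariables(mapping).visit(ast.parse(source))
    return ast.unparse(tree)  # requires Python 3.9+


def build_curriculum(source, mappings):
    """Return (difficulty, code) pairs sorted from easy to hard.

    Difficulty here is simply the number of renamed identifiers,
    an assumption for illustration only.
    """
    variants = [(0, source)]  # the untransformed sample is the easiest
    for mapping in mappings:
        variants.append((len(mapping), rename_variables(source, mapping)))
    return sorted(variants, key=lambda pair: pair[0])


if __name__ == "__main__":
    src = "def add(a, b):\n    total = a + b\n    return total\n"
    plan = build_curriculum(
        src,
        [{"a": "x", "b": "y", "total": "z"}, {"total": "t"}],
    )
    for difficulty, code in plan:
        print(f"# difficulty {difficulty}")
        print(code)
```

In a fine-tuning loop, the sorted variants would be fed to the model in this order, so that samples closest to the original data are seen first.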

We apply our approach to a range of pre-trained models, and the resulting models significantly outperform the state-of-the-art models on tasks for source code understanding, such as algorithm classification, code clone detection, and code search. Our experiments even show that, without heavy pre-training on code data, the natural-language pre-trained model RoBERTa fine-tuned with our lightweight approach can outperform or rival existing code pre-trained models, such as CodeBERT and GraphCodeBERT, fine-tuned on the above tasks. This finding suggests that there is still much room for improvement in code pre-trained models.

Wed 11 May

Displayed time zone: Eastern Time (US & Canada)

05:00 - 06:00
05:00
5m
Talk
Supporting program comprehension by generating abstract code summary tree
NIER - New Ideas and Emerging Results
Avijit Bhattacharjee University of Saskatchewan, Canada, Banani Roy University of Saskatchewan, Kevin Schneider University of Saskatchewan
DOI Pre-print Media Attached
05:05
5m
Talk
Practitioners’ Expectations on Automated Code Comment Generation
Technical Track
Xing Hu Zhejiang University, Xin Xia Huawei Software Engineering Application Technology Lab, David Lo Singapore Management University, Zhiyuan Wan Zhejiang University, Qiuyuan Chen Zhejiang University, Thomas Zimmermann Microsoft Research
DOI Pre-print Media Attached
05:10
5m
Talk
On the Evaluation of Neural Code Summarization
Technical Track
Ensheng Shi Xi'an Jiaotong University, Yanlin Wang Microsoft Research, Lun Du Microsoft Research Asia, Junjie Chen Tianjin University, Shi Han Microsoft Research, Hongyu Zhang University of Newcastle, Dongmei Zhang Microsoft Research, Hongbin Sun Xi'an Jiaotong University
DOI Pre-print Media Attached
05:15
5m
Talk
Bridging Pre-trained Models and Downstream Tasks for Source Code Understanding
Technical Track
Deze Wang National University of Defense Technology, Zhouyang Jia National University of Defense Technology, Shanshan Li National University of Defense Technology, Yue Yu College of Computer, National University of Defense Technology, Changsha 410073, China, Yun Xiong Fudan University, Wei Dong School of Computer, National University of Defense Technology, China, Liao Xiangke National University of Defense Technology
Pre-print Media Attached
05:20
5m
Talk
FIRA: Fine-Grained Graph-Based Code Change Representation for Automated Commit Message Generation
Technical Track
Jinhao Dong Peking University, Yiling Lou Purdue University, Qihao Zhu Peking University, Zeyu Sun Peking University, Zhilin Li Peking University, Wenjie Zhang Peking University, Dan Hao Peking University
Pre-print Media Attached
21:00 - 22:00
21:00
5m
Talk
Supporting program comprehension by generating abstract code summary tree
NIER - New Ideas and Emerging Results
Avijit Bhattacharjee University of Saskatchewan, Canada, Banani Roy University of Saskatchewan, Kevin Schneider University of Saskatchewan
DOI Pre-print Media Attached
21:05
5m
Talk
Designing Divergent Thinking, Creative Problem Solving Exams
SEET - Software Engineering Education and Training
Jeff Offutt George Mason University, Kesina Baral George Mason University
Pre-print Media Attached
21:10
5m
Talk
Practitioners’ Expectations on Automated Code Comment Generation
Technical Track
Xing Hu Zhejiang University, Xin Xia Huawei Software Engineering Application Technology Lab, David Lo Singapore Management University, Zhiyuan Wan Zhejiang University, Qiuyuan Chen Zhejiang University, Thomas Zimmermann Microsoft Research
DOI Pre-print Media Attached
21:15
5m
Talk
Retrieving Data Constraint Implementations Using Fine-Grained Code Patterns
Technical Track
Juan Manuel Florez The University of Texas at Dallas, Jonathan Perry The University of Texas at Dallas, Shiyi Wei University of Texas at Dallas, Andrian Marcus University of Texas at Dallas
Pre-print Media Attached
21:20
5m
Talk
Bridging Pre-trained Models and Downstream Tasks for Source Code Understanding
Technical Track
Deze Wang National University of Defense Technology, Zhouyang Jia National University of Defense Technology, Shanshan Li National University of Defense Technology, Yue Yu College of Computer, National University of Defense Technology, Changsha 410073, China, Yun Xiong Fudan University, Wei Dong School of Computer, National University of Defense Technology, China, Liao Xiangke National University of Defense Technology
Pre-print Media Attached

Information for Participants
Wed 11 May 2022 05:00 - 06:00 at ICSE room 2-odd hours - Program Comprehension 1 Chair(s): Prajish Prasad

Wed 11 May 2022 21:00 - 22:00 at ICSE room 1-odd hours - Program Comprehension 3 Chair(s): Christina von Flach