ICPC 2022
Mon 16 - Tue 17 May 2022
co-located with ICSE 2022
Mon 16 May 2022 22:32 - 22:39 at ICPC room - Session 10: Code Clones Chair(s): Chaiyong Ragkhitwetsagul

Current autograders of programming assignments are typically based on program output, and they fall short in many ways: for example, they do not carry out subjective evaluations such as assessing code quality or checking whether the code follows instructor-specified constraints; this is still done manually by teaching assistants. In this paper, we tackle a specific aspect of such evaluation: verifying whether a program implements a specific algorithm that the instructor specified. An algorithm, e.g. bubble sort, can be coded in myriad different ways, but a human can always understand the code and tell, say, a bubble sort from a selection sort. We develop and compare four approaches to do precisely this: given the source code of a program known to implement a certain functionality, identify the algorithm used, among a known set of algorithms. The approaches are based on code similarity, Support Vector Machines (SVM) with tree or graph kernels, a transformer neural architecture based only on source code (CodeBERT), and an extension of it that includes code structure (GraphCodeBERT). We further use a model for explainability (LIME) to generate insights into why certain programs get certain labels. Results on our datasets of sorting, searching and shortest-path programs show that GraphCodeBERT, fine-tuned with scrambled source code, i.e., code in which identifiers are replaced consistently with arbitrary words, gives the best performance in algorithm identification, with accuracy of 96-99% depending on the functionality, including correct classification of obfuscated source code.
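To make the idea of scrambled source code concrete, here is a minimal Python sketch of consistent identifier replacement. It is an illustration only, not the authors' tooling: the word pool `WORDS`, the `Scrambler` class and the `scramble` function are hypothetical names, the sketch assumes Python 3.9+ (for `ast.unparse`), and it only handles simple code without imports or attribute renaming.

```python
import ast
import builtins
import itertools

# Hypothetical pool of arbitrary words; any word list would serve.
WORDS = ["apple", "river", "stone", "cloud", "maple", "tiger", "ocean", "lamp"]


class Scrambler(ast.NodeTransformer):
    """Rename user-defined identifiers consistently with arbitrary words."""

    def __init__(self):
        self.mapping = {}  # original identifier -> replacement word
        self._fresh = self._word_stream()

    @staticmethod
    def _word_stream():
        # Yields the pool once, then suffixed variants if the pool runs out.
        for round_no in itertools.count():
            for word in WORDS:
                yield word if round_no == 0 else f"{word}{round_no}"

    def _rename(self, name):
        if hasattr(builtins, name):  # keep len, range, print, ... untouched
            return name
        if name not in self.mapping:
            self.mapping[name] = next(self._fresh)
        return self.mapping[name]

    def visit_FunctionDef(self, node):
        node.name = self._rename(node.name)
        self.generic_visit(node)  # also rename parameters and body
        return node

    def visit_arg(self, node):
        node.arg = self._rename(node.arg)
        return node

    def visit_Name(self, node):
        node.id = self._rename(node.id)
        return node


def scramble(source):
    """Return `source` with every user identifier replaced consistently."""
    tree = Scrambler().visit(ast.parse(source))
    return ast.unparse(tree)


SAMPLE = """
def bubble_sort(items):
    n = len(items)
    for i in range(n):
        for j in range(n - i - 1):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
    return items
"""

if __name__ == "__main__":
    print(scramble(SAMPLE))
```

Because every occurrence of a name maps to the same word, the scrambled program behaves exactly like the original, while naming hints such as `bubble_sort` are gone; a classifier trained on such code has to rely on structure rather than on identifier names.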

Mon 16 May

Displayed time zone: Eastern Time (US & Canada)

22:00 - 22:50
Session 10: Code Clones
Research / Early Research Achievements (ERA) at ICPC room
Chair(s): Chaiyong Ragkhitwetsagul Mahidol University, Thailand
22:00
7m
Talk
C4: Contrastive Cross-Language Code Clone Detection
Research
Chenning Tao Zhejiang University, Qi Zhan Zhejiang University, Xing Hu Zhejiang University, Xin Xia Huawei Software Engineering Application Technology Lab
22:07
7m
Talk
Predicting Change Propagation between Code Clone Instances by Graph-based Deep Learning
Research
Bin Hu Fudan University, Yijian Wu Fudan University, Xin Peng Fudan University, Chaofeng Sha Fudan University, Xiaocheng Wang Fudan University, Baiqiang Fu Fudan University, Wenyun Zhao Fudan University, China
22:14
4m
Talk
An Exploratory Study of Analyzing JavaScript Online Code Clones
Early Research Achievements (ERA)
Md Rakib Hossain Misu University of California, Irvine, Abdus Satter University of Dhaka
22:18
7m
Talk
Exploring and Understanding Cross-service Code Clones in Microservice Projects
Research
Yang Zhao Central China Normal University, Ran Mo Central China Normal University, Yao Zhang Central China Normal University, Siyuan Zhang Central China Normal University, Pu Xiong Central China Normal University
22:25
7m
Talk
MSCCD: Grammar Pluggable Clone Detection Based on ANTLR Parser Generation
Research
Wenqing ZHU Nagoya University, Norihiro Yoshida Ritsumeikan University, Toshihiro Kamiya Shimane University, Eunjong Choi Kyoto Institute of Technology, Hiroaki Takada Nagoya University
22:32
7m
Talk
Algorithm Identification in Programming Assignments
Research
Pranshu Chourasia Indian Institute of Technology - Bombay, Ganesh Ramakrishnan Indian Institute of Technology - Bombay, Varsha Apte Indian Institute of Technology - Bombay, Suraj Kumar Indian Institute of Technology - Bombay
22:39
11m
Live Q&A
Q&A-Paper Session 10
Research

