MSR 2022
Mon 23 - Tue 24 May 2022
co-located with ICSE 2022
Thu 19 May 2022 21:32 - 21:36 at MSR Main room - odd hours - Session 13: Security & Quality Chair(s): Gias Uddin

Context: Code Clone Detection (CCD) is a software engineering task used for plagiarism detection, code search, and code comprehension. Recently, deep learning-based models have achieved an F1 score (a metric used to assess classifiers) of approximately 95% on the CodeXGLUE benchmark. These models require large amounts of training data and are mainly fine-tuned on Java or C++ datasets. However, no previous study has evaluated the generalizability of these models when only a limited amount of annotated data is available.
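
For readers unfamiliar with the F1 score mentioned above, the following is a minimal sketch of how it is computed from the counts of a binary clone/non-clone classifier. The function name and the example counts are illustrative assumptions and are not taken from the paper or from CodeXGLUE.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall for a binary classifier."""
    predicted_pos = true_pos + false_pos
    actual_pos = true_pos + false_neg
    # Guard against division by zero when a class is never predicted or never present.
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Hypothetical counts: 95 clone pairs found, 5 false alarms, 5 clone pairs missed.
example_f1 = f1_score(true_pos=95, false_pos=5, false_neg=5)
```

With these made-up counts, precision and recall are both 0.95, so the F1 score is also 0.95.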

Objective: The main objective of this research is to assess how well CCD models and few-shot learning algorithms perform on unseen programming problems and new languages (i.e., problems and languages the model was not trained on).

Method: We assess the generalizability of state-of-the-art CCD models in few-shot settings (i.e., only a few samples are available for fine-tuning) under three scenarios: i) unseen problems, ii) unseen languages, and iii) a combination of new languages and new problems. We use three datasets (BigCloneBench, POJ-104, and CodeNet) and three languages (Java, C++, and Ruby). We then employ Model-Agnostic Meta-Learning (MAML), in which the model learns a meta-learner capable of extracting transferable knowledge from the training set, so that it can be fine-tuned using only a few samples. Finally, we combine contrastive learning with MAML to study whether it further improves on MAML's results.
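
The MAML idea described above can be illustrated with a toy sketch. This is not the authors' implementation and does not involve code models: it assumes a one-parameter regression model `y = w * x`, where each "task" is a different slope, so that the gradient through the inner adaptation step can be written out by hand. All names and hyperparameters (`maml_train`, `adapt`, `inner_lr`, `outer_lr`, `shots`) are hypothetical.

```python
import numpy as np


def grad(w, x, y):
    """Gradient of the mean squared error of y_hat = w * x with respect to w."""
    return 2.0 * np.mean(x * (w * x - y))


def hessian(x):
    """Second derivative of the same loss with respect to w (independent of w)."""
    return 2.0 * np.mean(x * x)


def adapt(w, x, y, inner_lr=0.1):
    """Inner loop: one gradient step on a few support samples of a single task."""
    return w - inner_lr * grad(w, x, y)


def maml_train(task_slopes, inner_lr=0.1, outer_lr=0.05, steps=200, shots=5, seed=0):
    """Outer loop: update the shared initialization w so that one inner step
    on a few samples of any training task yields a low query loss."""
    rng = np.random.default_rng(seed)
    w = 0.0
    for _ in range(steps):
        meta_grad = 0.0
        for slope in task_slopes:
            x_support = rng.normal(size=shots)
            x_query = rng.normal(size=shots)
            y_support, y_query = slope * x_support, slope * x_query
            w_adapted = adapt(w, x_support, y_support, inner_lr)
            # Chain rule through the inner step: d(w_adapted)/dw = 1 - inner_lr * hessian.
            meta_grad += grad(w_adapted, x_query, y_query) * (
                1.0 - inner_lr * hessian(x_support)
            )
        w -= outer_lr * meta_grad / len(task_slopes)
    return w


# Meta-train on three toy tasks, then adapt to an unseen task with three samples.
w_meta = maml_train([1.0, 2.0, 3.0])
x_new = np.array([1.0, -1.0, 2.0])
w_new = adapt(w_meta, x_new, 5.0 * x_new)
```

In the setting of the paper, the scalar `w` would correspond to the parameters of a CCD model and each task to a programming problem or language, but that mapping is only an analogy for this sketch.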

Thu 19 May

Displayed time zone: Eastern Time (US & Canada)

21:00 - 21:50
Session 13: Security & Quality
Technical Papers / Data and Tool Showcase Track / Registered Reports / Industry Track at MSR Main room - odd hours
Chair(s): Gias Uddin University of Calgary, Canada
On the Use of Fine-grained Vulnerable Code Statements for Software Vulnerability Assessment Models
Technical Papers
Triet Le The University of Adelaide, Muhammad Ali Babar University of Adelaide
LineVD: Statement-level Vulnerability Detection using Graph Neural Networks
Technical Papers
David Hin The University of Adelaide, Andrey Kan The University of Adelaide, Huaming Chen The University of Adelaide, Muhammad Ali Babar University of Adelaide
LineVul: A Transformer-based Line-Level Vulnerability Prediction
Technical Papers
Michael Fu Monash University, Kla Tantithamthavorn Monash University
ECench: An Energy Bug Benchmark of Ethereum Client Software
Data and Tool Showcase Track
Jinyoung Kim Sungkyunkwan University, Misoo Kim Sungkyunkwan University, Eunseok Lee Sungkyunkwan University
Microsoft CloudMine: Data Mining for the Executive Order on Improving the Nation’s Cybersecurity
Industry Track
Kim Herzig Tools for Software Engineers, Microsoft, Luke Gostling Microsoft Corporation, Maximilian Grothusmann Microsoft Corporation, Nora Huang Microsoft Corporation, Sascha Just Microsoft, Alan Klimowski Microsoft Corporation, Yashasvini Ramkumar Microsoft Corporation, Myles McLeroy Microsoft Corporation, Kıvanç Muşlu Microsoft, Hitesh Sajnani Microsoft, Varsha Vadaga Microsoft Corporation
Evaluating few shot and Contrastive learning Methods for Code Clone Detection
Registered Reports
Mohamad Khajezade University of British Columbia, Fatemeh Hendijani Fard University of British Columbia, Mohamed S Shehata University of British Columbia
Live Q&A
Discussions and Q&A
Technical Papers
