Zero-Shot Program Representation Learning (ICPC 2022 - Research)

Who

Nan Cui, Yuze Jiang, Xiaodong Gu, Beijun Shen

Track

ICPC 2022 Research

Time Zone

The program is currently displayed in (GMT-04:00) Eastern Time (US & Canada).


The GMT offsets shown reflect the offsets at the moment of the conference.


When

Sun 15 May 2022 22:30 - 22:37 at ICPC room - Session 2: Program Representation 1 Chair(s): Fatemeh Hendijani Fard

Abstract

Learning program representations is a core prerequisite for code intelligence tasks such as code search and code clone detection. State-of-the-art pre-trained models such as CodeBERT require large-scale code corpora. However, gathering training samples can be costly or infeasible for domain-specific languages such as Solidity for smart contracts. In this paper, we propose Zecoler, a zero-shot learning approach for code representations. Zecoler is built upon a pre-trained programming language model. To elicit knowledge from the pre-trained model efficiently, Zecoler casts downstream tasks into the same form as the pre-training tasks by inserting trainable prompts into the original input. It then employs prompt learning, which optimizes the pre-trained model by merely adjusting the original input. This enables the representation model to fit scarce task-oriented data efficiently while reusing pre-trained knowledge. We evaluate Zecoler on three code intelligence tasks in two programming languages that have no training samples, namely Solidity and Go, with models trained on corpora of common languages such as Java. Experimental results show that our approach significantly outperforms baseline models in both zero-shot and few-shot settings.
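
The idea of casting a downstream task into the form of a pre-training task can be illustrated with a minimal sketch. It is not Zecoler's implementation: it assumes the pre-training task is masked-token prediction and uses clone detection as the downstream task. The placeholder tokens `[P0]`, `[P1]`, ..., the position of the mask slot, and the "yes" answer word are all hypothetical. In a real prompt-learning setup the placeholder tokens would have trainable embeddings; the sketch shows only how the input might be laid out.

```python
from typing import List

MASK = "[MASK]"  # slot the pre-trained model would be asked to fill (assumed)


def build_prompted_input(code_a: List[str], code_b: List[str],
                         num_prompts: int = 4) -> List[str]:
    """Wrap two code token sequences with placeholder prompt tokens and one mask slot.

    The layout is a hypothetical template, not the one used in the paper.
    """
    # Placeholder names standing in for prompt tokens with trainable embeddings.
    prompts = [f"[P{i}]" for i in range(num_prompts)]
    half = num_prompts // 2
    return ["[CLS]", *code_a, *prompts[:half], MASK, *prompts[half:], *code_b, "[SEP]"]


def label_from_mask_prediction(predicted_token: str) -> bool:
    """Map the word predicted at the mask slot to a clone / not-clone label.

    The "yes" answer word is an assumption made for this sketch.
    """
    return predicted_token == "yes"


tokens = build_prompted_input(["a", "+", "b"], ["b", "+", "a"])
print(tokens)
```

With this layout, the classification task becomes a fill-in-the-blank prediction, so the pre-trained prediction head can be reused instead of training a new classifier from scratch.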

Link to Preprint

https://arxiv.org/pdf/2204.08360.pdf

Nan Cui

Shanghai Jiao Tong University

Yuze Jiang

Shanghai Jiao Tong University

Xiaodong Gu

Shanghai Jiao Tong University, China

Beijun Shen

School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University


Session Program

Sun 15 May
Displayed time zone: Eastern Time (US & Canada)

22:30 - 23:20	Session 2: Program Representation 1 (Research) at ICPC room
Chair(s): Fatemeh Hendijani Fard (University of British Columbia)

22:30	7m	Talk	Zero-Shot Program Representation Learning (Research)
	Nan Cui (Shanghai Jiao Tong University), Yuze Jiang (Shanghai Jiao Tong University), Xiaodong Gu (Shanghai Jiao Tong University, China), Beijun Shen (School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University)
22:37	7m	Talk	On The Cross-Modal Transfer from Natural Language to Code through Adapter Modules (Research)
	Divyam Goel (Indian Institute of Technology Roorkee), Ramansh Grover (Delhi Technological University), Fatemeh Hendijani Fard (University of British Columbia)
22:44	7m	Talk	Self-Supervised Learning of Smart Contract Representations (Research)
	Shouliang Yang (School of Software, Shanghai Jiao Tong University), Beijun Shen (School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University), Xiaodong Gu (Shanghai Jiao Tong University, China)
22:51	7m	Talk	An Exploratory Study on Code Attention in BERT (Research)
	Rishab Sharma (University of British Columbia), Fuxiang Chen (University of British Columbia), Fatemeh Hendijani Fard (University of British Columbia), David Lo (Singapore Management University)
22:58	7m	Talk	Accurate Generation of Trigger-Action Programs with Domain-Adapted Sequence-to-Sequence Learning (Research)
	Imam Nur Bani Yusuf (Singapore Management University), Lingxiao Jiang (Singapore Management University), David Lo (Singapore Management University)
23:05	15m	Live Q&A	Q&A-Paper Session 2 (Research)

Information for Participants

Sun 15 May 2022 22:30 - 23:20 at ICPC room - Session 2: Program Representation 1 Chair(s): Fatemeh Hendijani Fard
