APSEC 2022
Tue 6 - Fri 9 December 2022
Wed 7 Dec 2022 13:20 - 13:40 at Room2 - Machine Learning 1 Chair(s): Syful Islam

The development of a question answering (QA) system for code can greatly facilitate program understanding for developers. Recently, pre-trained language models (PLMs) have shown promising results on the code QA task. However, directly applying PLMs to code QA often yields suboptimal performance due to the large discrepancy between pre-training and the downstream QA task: while code PLMs are pre-trained on large-scale unlabeled code corpora, annotated QA pairs for fine-tuning are often scarce. Existing code PLMs simply reuse the code representation component and train the QA component from scratch, which causes the model to overfit the QA data. In this paper, we propose CodeMaster, a novel pre-training based approach for automatically answering code questions via task adaptation. CodeMaster builds on CodeT5, a popular PLM for source code. To mitigate the gap between pre-training and QA, CodeMaster continually pre-trains CodeT5 on multiple self-supervised learning tasks such as partial comment completion and noun-phrase prediction. Experimental results on the CodeQA benchmark show that CodeMaster achieves state-of-the-art performance, highlighting the effectiveness of our approach.
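As a rough illustration of the "partial comment completion" objective mentioned in the abstract, one could construct seq2seq training pairs in which the model sees the code plus the first part of a comment and must generate the remainder. The helper name and the exact source/target format below are hypothetical sketches, not taken from the paper:

```python
def make_partial_comment_example(code: str, comment: str, keep_ratio: float = 0.5):
    """Build one (source, target) pair for a seq2seq model such as CodeT5.

    The model is shown the code and a truncated comment prefix (source)
    and is trained to generate the remaining comment words (target).
    The prompt format here is an assumption for illustration only.
    """
    words = comment.split()
    # Keep at least one word in the visible prefix.
    cut = max(1, int(len(words) * keep_ratio))
    prefix = " ".join(words[:cut])
    target = " ".join(words[cut:])
    source = f"complete comment: {prefix} </s> code: {code}"
    return source, target

src, tgt = make_partial_comment_example(
    "def add(a, b): return a + b",
    "adds two numbers and returns the sum",
)
# src -> "complete comment: adds two numbers </s> code: def add(a, b): return a + b"
# tgt -> "and returns the sum"
```

Such pairs could then be fed to any encoder-decoder PLM fine-tuning loop; the point of task adaptation is that this self-supervised objective requires no annotated QA data.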

Wed 7 Dec

Displayed time zone: Osaka, Sapporo, Tokyo

13:00 - 14:00
Machine Learning 1 (Technical Track) at Room2
Chair(s): Syful Islam (Nara Institute of Science and Technology)
Catch Me If You Can: Blackbox Adversarial Attacks on Automatic Speech Recognition using Frequency Masking
Technical Track
Xiaoliang Wu (University of Edinburgh), Ajitha Rajan (University of Edinburgh)
Code Question Answering via Task-Adaptive Sequence-to-Sequence Pre-training
Technical Track
Tingrui Yu (School of Software, Shanghai Jiao Tong University), Beijun Shen (School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University), Xiaodong Gu (Shanghai Jiao Tong University)