ISSTA 2022
Mon 18 - Fri 22 July 2022 Online

Automatic program understanding and generation techniques could significantly advance programmer productivity and have been widely studied by academia and industry. Recently, the advent of the pre-training paradigm has inspired researchers to develop general-purpose pre-trained models that can be applied to a broad range of program understanding and generation tasks. Such pre-trained models, derived via self-supervised objectives on large unlabelled corpora, can be fine-tuned for downstream tasks (such as code search and code generation) with minimal adaptation. Although these pre-trained models claim superiority over prior techniques, they seldom follow equivalent evaluation protocols, e.g., they are rarely evaluated on identical benchmarks, tasks, or settings. Consequently, there is a pressing need for a comprehensive study of pre-trained models on their effectiveness, versatility, and limitations to provide implications and guidance for future development in this area. To this end, we first perform an extensive study of eight open-access pre-trained models over a large benchmark on seven representative code tasks to assess their reproducibility. We further compare the pre-trained models against domain-specific state-of-the-art techniques to validate their effectiveness. Finally, we investigate the robustness of the pre-trained models by inspecting their performance variations under adversarial attacks. Through the study, we find that while we can in general replicate the original performance of the pre-trained models on their evaluated tasks and adopted benchmarks, subtle performance fluctuations can refute findings in their original papers. Moreover, no existing pre-trained model dominates all the others. We also find that pre-trained models can significantly outperform non-pre-trained state-of-the-art techniques in program understanding tasks.
Furthermore, we perform the first study of natural language-programming language pre-trained model robustness via adversarial attacks and find that a simple random attack can easily fool the state-of-the-art pre-trained models and thus incur security issues. Finally, we provide multiple practical guidelines for advancing future research on pre-trained models for program understanding and generation.
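The abstract does not detail how the random adversarial attack is implemented. As one hedged illustration of the general idea, a semantics-preserving random identifier renaming — a common perturbation in this line of work for fooling code models without changing program behavior — can be sketched as follows. The `random_rename_attack` helper, its naming scheme, and the use of Python's `ast` module are our own illustrative assumptions, not the paper's actual code:

```python
import ast
import random
import string

class RandomRename(ast.NodeTransformer):
    """Naively rename every variable identifier to a random meaningless name.

    Note: this simple sketch renames all Name/arg nodes, so it is only safe
    for self-contained snippets that reference no builtins or imports.
    """
    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.mapping = {}  # original name -> random replacement

    def _new_name(self):
        return "v_" + "".join(self.rng.choice(string.ascii_lowercase) for _ in range(6))

    def visit_Name(self, node):
        if node.id not in self.mapping:
            self.mapping[node.id] = self._new_name()
        node.id = self.mapping[node.id]
        return node

    def visit_arg(self, node):
        if node.arg not in self.mapping:
            self.mapping[node.arg] = self._new_name()
        node.arg = self.mapping[node.arg]
        return node

def random_rename_attack(source, seed=0):
    """Return a behaviorally equivalent variant of `source` with renamed variables."""
    tree = ast.parse(source)
    tree = RandomRename(seed).visit(tree)
    return ast.unparse(tree)  # requires Python 3.9+

original = "def add(x, y):\n    total = x + y\n    return total"
adversarial = random_rename_attack(original)
```

Because the transformation preserves program semantics, a robust model should produce the same prediction (e.g., the same code summary or search ranking) for `original` and `adversarial`; a large prediction shift under such trivial perturbations is the kind of fragility the study reports.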

Wed 20 Jul

Displayed time zone: Seoul

03:00 - 04:00
Session 1-3: Oracles, Models, and Measurement A
Technical Papers at ISSTA 1
03:00
20m
Talk
Using Pre-trained Language Models to Resolve Textual and Semantic Merge Conflicts (Experience Paper)
Technical Papers
Jialu Zhang Yale University, Todd Mytkowicz Microsoft Research, Mike Kaufman Microsoft Corporation, Ruzica Piskac Yale University, Shuvendu Lahiri Microsoft Research
DOI
03:20
20m
Talk
Metamorphic Relations via Relaxations: An Approach to Obtain Oracles for Action-Policy Testing
Technical Papers
Hasan Ferit Eniser MPI-SWS, Timo P. Gros Saarland University, Germany, Valentin Wüstholz ConsenSys, Jörg Hoffmann Saarland University and DFKI, Germany, Maria Christakis MPI-SWS
DOI Pre-print
03:40
20m
Talk
An Extensive Study on Pre-trained Models for Program Understanding and Generation
Technical Papers
Zhengran Zeng Southern University of Science and Technology, Hanzhuo Tan Southern University of Science and Technology, The Hong Kong Polytechnic University, Haotian Zhang, Jing Li The Hong Kong Polytechnic University, Yuqun Zhang Southern University of Science and Technology, Lingming Zhang University of Illinois at Urbana-Champaign
DOI

Fri 22 Jul

Displayed time zone: Seoul

16:40 - 17:40
Session 3-11: Oracles, Models, and Measurement C
Technical Papers at ISSTA 1
16:40
20m
Talk
An Extensive Study on Pre-trained Models for Program Understanding and Generation
Technical Papers
Zhengran Zeng Southern University of Science and Technology, Hanzhuo Tan Southern University of Science and Technology, The Hong Kong Polytechnic University, Haotian Zhang, Jing Li The Hong Kong Polytechnic University, Yuqun Zhang Southern University of Science and Technology, Lingming Zhang University of Illinois at Urbana-Champaign
DOI
17:00
20m
Talk
Metamorphic Relations via Relaxations: An Approach to Obtain Oracles for Action-Policy Testing
Technical Papers
Hasan Ferit Eniser MPI-SWS, Timo P. Gros Saarland University, Germany, Valentin Wüstholz ConsenSys, Jörg Hoffmann Saarland University and DFKI, Germany, Maria Christakis MPI-SWS
DOI Pre-print
17:20
20m
Talk
TELL: Log Level Suggestions via Modeling Multi-level Code Block Information
Technical Papers
Jiahao Liu National University of Singapore, Jun Zeng National University of Singapore, Xiang Wang University of Science and Technology of China, Kaihang Ji National University of Singapore, Zhenkai Liang National University of Singapore
DOI