On-the-fly Improving Performance of Deep Code Models via Input Denoising
Deep learning has been widely adopted to tackle various code-based tasks by building deep code models trained on large corpora of code snippets. While these deep code models have achieved great success, even state-of-the-art models suffer from noise in their inputs, which leads to erroneous predictions. Although models can be enhanced through retraining/fine-tuning, this is not a once-and-for-all solution and incurs significant overhead. In particular, such techniques cannot improve the performance of (deployed) models on the fly. Input denoising techniques exist in other domains (such as image processing), but because code inputs are discrete and must strictly obey complex syntactic and semantic constraints, those techniques are largely inapplicable to code. In this work, we propose the first input denoising technique (i.e., CodeDenoise) for deep code models. Its key idea is to localize noisy identifiers in (likely) mispredicted inputs and to denoise such inputs by cleansing the located identifiers. It requires neither retraining nor reconstructing the model; it only cleanses inputs on the fly to improve performance. Our experiments on 18 deep code models (i.e., three pre-trained models with six code-based datasets) demonstrate the effectiveness and efficiency of CodeDenoise. For example, on average, CodeDenoise successfully denoises 21.91% of mispredicted inputs and improves the accuracy of the original models by 2.04% across all subjects, spending an average of 0.48 seconds per input, substantially outperforming the widely used fine-tuning strategy.
Attachment: ASE23-1-CodeDenoise.pdf (3.46 MiB)
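The abstract describes the technique only at a high level: localize the identifiers in a (likely) mispredicted input that sway the model most, then cleanse them by renaming until the prediction stabilizes. The sketch below illustrates that loop in Python; it is a minimal approximation, not the paper's implementation. Every name in it (predict, clean_names, extract_identifiers, rename_identifier, denoise) is a hypothetical placeholder, the regex-based identifier handling is a simplification, and a confidence-gain probe stands in for the paper's dedicated localization step.

import re
from typing import Callable, List, Tuple

# Hypothetical interface: a predictor maps a code snippet to (label, confidence).
Predictor = Callable[[str], Tuple[str, float]]

def extract_identifiers(code: str) -> List[str]:
    # Simplification: treat every word-like token as a candidate identifier.
    # A faithful implementation would parse the code and respect keywords and scoping.
    return sorted(set(re.findall(r"[A-Za-z_]\w*", code)))

def rename_identifier(code: str, old: str, new: str) -> str:
    # Word-boundary rename; ignores string literals and comments for brevity.
    return re.sub(rf"\b{re.escape(old)}\b", new, code)

def denoise(code: str, predict: Predictor, clean_names: List[str],
            max_idents: int = 5) -> str:
    """Cleanse a (likely) mispredicted input on the fly: probe which
    identifiers sway the model most, then keep the renamings that raise
    its confidence. The model itself is never retrained."""
    best_code = code
    _, best_conf = predict(code)

    # Localize: rank identifiers by the confidence gain of a probe rename
    # (a stand-in for the paper's localization technique).
    def gain(ident: str) -> float:
        probe = rename_identifier(code, ident, clean_names[0])
        _, conf = predict(probe)
        return conf - best_conf

    ranked = sorted(extract_identifiers(code), key=gain, reverse=True)

    # Denoise: greedily adopt renamings that improve the model's confidence.
    for ident in ranked[:max_idents]:
        for name in clean_names:
            candidate = rename_identifier(best_code, ident, name)
            _, conf = predict(candidate)
            if conf > best_conf:
                best_code, best_conf = candidate, conf
    return best_code

Because the search only rewrites the input and re-queries the model, it can run at inference time on a deployed model, which is what distinguishes this setting from retraining or fine-tuning.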
Wed 13 Sep (displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna)
10:30 - 12:00 | Code Quality and Code Smells (Tool Demonstrations / Journal-first Papers / Research Papers), Plenary Room 2. Chair(s): Bernd Fischer (Stellenbosch University)

10:30, 12m talk: Contextuality of Code Representation Learning (Research Papers). Yi Li (New Jersey Institute of Technology), Shaohua Wang (New Jersey Institute of Technology), Tien N. Nguyen (University of Texas at Dallas)

10:42, 12m talk: On-the-fly Improving Performance of Deep Code Models via Input Denoising (Research Papers). Pre-print, File Attached

10:54, 12m talk: Using Deep Learning to Automatically Improve Code Readability (Research Papers). Antonio Vitale (University of Molise, Italy), Valentina Piantadosi (University of Molise), Simone Scalabrino (University of Molise), Rocco Oliveto (University of Molise). Pre-print

11:06, 12m talk: Towards Automatically Addressing Self-Admitted Technical Debt: How Far Are We? (Research Papers). Antonio Mastropaolo (Università della Svizzera italiana), Massimiliano Di Penta (University of Sannio, Italy), Gabriele Bavota (Software Institute, USI Università della Svizzera italiana). Pre-print, File Attached

11:18, 12m talk: How to Find Actionable Static Analysis Warnings: A Case Study with FindBugs (Journal-first Papers). Rahul Yedida, Hong Jin Kang (UCLA), Huy Tu (North Carolina State University, USA), Xueqi Yang (NCSU), David Lo (Singapore Management University), Tim Menzies (North Carolina State University). Link to publication, DOI, Authorizer link, Pre-print

11:30, 12m talk: Polyglot Code Smell Detection for Infrastructure as Code with GLITCH (Tool Demonstrations). Nuno Saavedra (INESC-ID and IST, University of Lisbon), João Gonçalves (INESC-ID and IST, University of Lisbon), Miguel Henriques (INESC-ID and IST, University of Lisbon), João F. Ferreira (INESC-ID and IST, University of Lisbon), Alexandra Mendes (Faculty of Engineering, University of Porto & INESC TEC). Pre-print, File Attached

11:42, 12m talk: Enhancing the defectiveness prediction of methods and classes via JIT (Journal-first Papers). Falessi Davide (University of Rome Tor Vergata), Simone Mesiano Laureani (University of Rome Tor Vergata), Jonida Çarka (University of Rome Tor Vergata), Matteo Esposito (University of Rome Tor Vergata), Daniel Alencar Da Costa (University of Otago). Link to publication, DOI, File Attached