Enhancing Identifier Naming Through Multi-Mask Fine-tuning of Language Models of Code (Research Object Reviewed; Open Research Object)
Code readability strongly influences code comprehension and, to some degree, code quality. Unreadable code makes software maintenance more challenging and is more prone to bugs. Good identifier names are crucial to readability. Existing studies on automatic identifier renaming have not considered aspects such as the code context. Additionally, prior research has done little to address the typical challenges inherent in the identifier renaming task. In this paper, we propose a new approach for renaming identifiers in source code by fine-tuning a transformer model. Using perplexity as the evaluation metric, we find that the fine-tuned model reduces perplexity roughly tenfold compared to the baseline, from 363 to 36. To further validate our method, we conduct a developer survey to gauge the suitability of the generated identifiers, comparing the original identifiers with those generated by our approach and by two state-of-the-art large language models, GPT-4 Turbo and Gemini Pro. Our approach generates better identifier names than the original names and performs competitively with state-of-the-art commercial large language models. The proposed method has implications for software developers, tool vendors, and researchers. Software developers may use it to generate better variable names, increasing the clarity and readability of their software. Researchers in the field may use and build upon it for variable renaming.
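The abstract reports perplexity values but does not say how they were computed. As a rough illustration only, the sketch below uses the standard definition, the exponential of the mean negative log-likelihood over the predicted tokens. The function name and the probabilities are hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Return exp(mean negative log-likelihood) for natural-log probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Hypothetical per-token probabilities a model assigns to the correct
# identifier tokens (illustrative values, not results from the paper).
uncertain = [math.log(p) for p in (0.01, 0.02, 0.005)]
confident = [math.log(p) for p in (0.6, 0.5, 0.7)]

print(f"uncertain model: {perplexity(uncertain):.1f}")
print(f"confident model: {perplexity(confident):.1f}")
```

Lower values mean the model assigns higher probability to the expected tokens, which is why a drop from 363 to 36 is read as an improvement.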
Mon 7 Oct
Displayed time zone: Arizona
15:30 - 17:00
15:30 (16m) | Research paper | Enhancing Recommendations of Composite Refactorings based on the Practice | Research Track | Ana Carla Bibiano, Pontifical Catholic University of Rio de Janeiro (PUC-Rio); Anderson Uchôa, Federal University of Ceará; Daniel Tenório, Pontifical Catholic University of Rio de Janeiro (PUC-Rio); Daniel Coutinho, Pontifical Catholic University of Rio de Janeiro (PUC-Rio); Wesley Assunção, North Carolina State University; Alessandro Garcia, Pontifical Catholic University of Rio de Janeiro (PUC-Rio); Baldoino Fonseca, Federal University of Alagoas (UFAL); Márcio Ribeiro, Federal University of Alagoas, Brazil; Thelma Elita Colanzi, State University of Maringa, Brazil; Audrey Vasconcelos, Federal University of Alagoas (UFAL); Rafael de Mello, UFRJ, Brazil
15:47 (16m) | Research paper | The Hidden Costs of Automation: An Empirical Study on GitHub Actions Workflow Maintenance | Research Track | Pablo Valenzuela-Toledo, University of Bern, Universidad de La Frontera; Alexandre Bergel, University of Chile; Oscar Nierstrasz, feenk.com; Timo Kehrer, University of Bern | Pre-print
16:04 (16m) | Research paper | Enhancing Identifier Naming Through Multi-Mask Fine-tuning of Language Models of Code (Research Object Reviewed; Open Research Object) | Research Track | Sanidhya Vijayvargiya, BITS Pilani Hyderabad Campus; Mootez Saad, Dalhousie University; Tushar Sharma, Dalhousie University | Pre-print
16:21 (16m) | Research paper | Enhancing Security through Modularization: A Counterfactual Analysis of Vulnerability Propagation and Detection Precision | Research Track | Mohammad Mahdi Abdollahpour, University of Waterloo; Jens Dietrich, Victoria University of Wellington; Patrick Lam, University of Waterloo | Pre-print
16:40 (20m) | Live Q&A | Discussion (Maintenance) | Research Track