Improve the Performance of Large Language Models on Code Generation
Automatic code generation is an advanced application of program understanding and is considered a crucial method for improving the automation level and quality of software development. Researchers have recently extended the application of large language models (LLMs) to code generation, with impressive results. However, the code generated by these models does not always align with developers' specific requirements, and modifying the models directly is challenging, since LLMs are often black-box and require substantial computational resources. To address this problem, I plan to apply post-processing to the output of LLMs. In this proposal, I first present a literature review of the field and then propose two potential directions for mitigating this problem.
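One common form of such post-processing, filtering sampled candidates with lightweight checks, can be sketched as follows. This is an illustrative example only, not the proposal's actual method; the candidate strings, the `add` function name, and the test cases are hypothetical placeholders standing in for real LLM output and real requirements.

```python
import ast


def syntactically_valid(code: str) -> bool:
    """Cheap post-processing filter: reject candidates that do not parse."""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False


def passes_tests(code: str, tests, fn_name: str) -> bool:
    """Execute a candidate in a fresh namespace and check input/output examples."""
    namespace = {}
    try:
        exec(code, namespace)
        fn = namespace[fn_name]
        return all(fn(*args) == expected for args, expected in tests)
    except Exception:
        return False


# Hypothetical candidates, as an LLM might return for "add two numbers".
candidates = [
    "def add(a, b):\n    return a - b",   # wrong operator, fails tests
    "def add(a, b)\n    return a + b",    # syntax error, fails to parse
    "def add(a, b):\n    return a + b",   # correct
]
tests = [((1, 2), 3), ((0, 0), 0)]

survivors = [c for c in candidates
             if syntactically_valid(c) and passes_tests(c, tests, "add")]
print(len(survivors))  # only the correct candidate survives
```

The key design point is that the LLM itself is never modified: all checks run on its output, so the approach works even when the model is a black box behind an API.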
I am Jinhao Dong, a PhD student at Peking University scheduled to graduate in 2025. My research interests primarily lie in deep learning and software testing. My PhD research focuses on collaborative software development, which is essential for improving productivity on large-scale projects. I have introduced fine-grained structured representations for code changes in commit message generation (ICSE22) and for conflict resolution in merge conflict resolution (ASE23). Additionally, I have proposed specialized neural networks, including a graph neural network with a dual copy mechanism for commit message generation (ICSE22) and generative models for merge conflict resolution (ASE23). Moreover, I have proposed a pattern-based approach to evaluating generated commit messages by matching patterns that reflect their details and distribution (ICSE23).
Furthermore, I have devised a generative adversarial network called MarginGAN, which leverages margin theory to enhance the accuracy of semi-supervised classifiers (NeurIPS19). I have also proposed a new direction for accelerating regression testing by reusing program states and skipping unnecessary program executions (ASE20 NIER Track). Other areas of my research include fault localization (FSE21) and test case reduction (ISSRE20).
Mon 11 Sep (time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna)
13:30 - 15:00
13:30 (30m, Talk): Deferring Partial Analysis Execution for Soundness. Doctoral Forum. Anemone Kampkötter, TU Dortmund
14:00 (30m, Talk): Improve the Performance of Large Language Models on Code Generation. Doctoral Forum. Jinhao Dong, Peking University
14:30 (30m, Talk): Analysis and Tool-Support for Scalable and Reliable Imperative Deep Learning Programs. Doctoral Forum. Tatiana Castro Vélez, City University of New York (CUNY) Graduate Center