Fine-Grained Code-Comment Semantic Interaction Analysis
Code comments, i.e., natural language text that describes code, are considered crucial for program comprehension. Existing approaches mainly focus on comment generation or comment update, and thus fall short of explaining which part of the code leads to a specific piece of content in the comment. In this paper, we argue that addressing this challenge can better facilitate code understanding. We propose FOSTERER, a Fine-grained code-comment Semantic interaction analyzer, which builds fine-grained semantic interactions between code statements and comment tokens. It not only leverages advanced deep learning techniques such as cross-modal learning and contrastive learning, but also draws on pre-trained vision models. Specifically, it mimics the comprehension practice of developers, treating code statements as image patches and comments as text, and uses contrastive learning to match the semantically related parts of the visual and textual information. Experiments on a large-scale manually labelled dataset show that our approach achieves an F1-score of around 80%, exceeding a heuristic-based baseline by a large margin. We also find that FOSTERER is efficient: it needs only 1.5 seconds to infer the results for a code-comment pair. Furthermore, a user study demonstrates its usability: in 65% of cases, its predictions are considered useful for improving code understanding. Our research therefore sheds light on a promising direction for program comprehension.
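The matching idea in the abstract can be illustrated with a minimal sketch. This is not FOSTERER's architecture: the vectors below are hand-made toy embeddings standing in for the outputs of learned encoders, and the function names, the similarity threshold, and the temperature are all hypothetical choices made for illustration. The sketch links each comment token to the code statements whose embeddings are similar enough, and shows an InfoNCE-style contrastive loss of the kind commonly used to train such matching.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def link_tokens_to_statements(stmt_vecs, token_vecs, threshold=0.5):
    """For each comment token, list the indices of the code statements
    whose similarity to that token reaches the (assumed) threshold."""
    return [
        [i for i, s in enumerate(stmt_vecs) if cosine(s, t) >= threshold]
        for t in token_vecs
    ]


def info_nce(stmt_vecs, token_vecs, temperature=0.1):
    """InfoNCE-style loss, assuming token i's positive is statement i
    and every other statement acts as a negative."""
    total = 0.0
    for i, t in enumerate(token_vecs):
        logits = [cosine(s, t) / temperature for s in stmt_vecs]
        peak = max(logits)  # subtract the max for numerical stability
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(token_vecs)


# Toy embeddings (assumptions): two statements and two comment tokens.
stmt_vecs = [[1.0, 0.0], [0.0, 1.0]]
token_vecs = [[0.9, 0.1], [0.1, 0.9]]

links = link_tokens_to_statements(stmt_vecs, token_vecs)
aligned_loss = info_nce(stmt_vecs, token_vecs)
swapped_loss = info_nce(stmt_vecs, list(reversed(token_vecs)))
```

With these toy vectors, each token is linked only to its own statement, and the loss is lower when tokens are paired with their matching statements than when the pairing is swapped, which is the behaviour contrastive training rewards.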
Tue 17 May
Displayed time zone: Eastern Time (US & Canada)
07:50 - 08:40 | Session 14: Documentation | Research / Early Research Achievements (ERA) / Tool Demonstration, at ICPC room | Chair(s): Fiorella Zampetti (University of Sannio, Italy)
07:50 | 7m | Talk | Fine-Grained Code-Comment Semantic Interaction Analysis | Research | Mingyang Geng (National University of Defense Technology); Shangwen Wang (National University of Defense Technology); Dezun Dong (NUDT); Shanzhi Gu (Hunan Huishiwei Intelligent Technology Co., Ltd.); Fang Peng (University of Chinese Academy of Sciences); Weijian Ruan (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Liao Xiangke (National University of Defense Technology)
07:57 | 4m | Talk | Using Discord Conversations as Program Comprehension Aid | Early Research Achievements (ERA) | Marco Raglianti (Software Institute - USI, Lugano); Csaba Nagy (Software Institute - USI, Lugano); Roberto Minelli (Software Institute - USI, Lugano); Michele Lanza (Software Institute - USI, Lugano)
08:01 | 7m | Talk | Demystifying Software Release Note Issues on GitHub | Research | Jianyu Wu (Peking University); Hao He (Peking University); Wenxin Xiao (School of Computer Science, Peking University); Kai Gao (University of Science and Technology Beijing); Minghui Zhou (Peking University, China)
08:08 | 4m | Talk | A First Look at Duplicate and Near-duplicate Self-admitted Technical Debt Comments | Early Research Achievements (ERA) | Jerin Yasmin (Queen's University, Canada); Mohammad Sadegh Sheikhaei (Queen's University); Yuan Tian (Queens University, Kingston, Canada)
08:12 | 7m | Talk | HatCUP: Hybrid Analysis and Attention based Just-In-Time Comment Updating | Research | Hongquan Zhu (State Key Laboratory for Novel Software Technology, Nanjing University); Xincheng He (State Key Laboratory for Novel Software Technology, Nanjing University); Lei Xu (State Key Laboratory for Novel Software Technology, Nanjing University)
08:19 | 4m | Talk | Casdoc: Unobtrusive Explanations in Code Examples | Tool Demonstration | Mathieu Nassif (McGill University); Zara Horlacher (McGill University); Martin P. Robillard (McGill University)
08:23 | 17m | Live Q&A | Q&A-Paper Session 14 | Research