ICPC 2022
Mon 16 - Tue 17 May 2022
co-located with ICSE 2022
Sun 15 May 2022 21:30 - 21:37 at ICPC room - Session 1: Summarization Chair(s): Haipeng Cai

Stack Overflow is often viewed as the most influential Software Question & Answer (SQA) website, with millions of programming-related questions and answers. Tags play a critical role in efficiently structuring the content on Stack Overflow and are vital to supporting a range of site operations, e.g., querying relevant content. Poorly selected tags often introduce extra noise and redundancy, leading to the tag synonym and tag explosion problems. Thus, an automated tag recommendation technique that can accurately recommend high-quality tags is desired to alleviate these problems. Inspired by the recent success of pre-trained language models (PTMs) in natural language processing (NLP), we present PTM4Tag, a tag recommendation framework for Stack Overflow posts that utilizes PTMs with a triplet architecture, modeling the components of a post, i.e., Title, Description, and Code, with independent language models. To the best of our knowledge, this is the first work that leverages PTMs for the tag recommendation task on SQA sites. We comparatively evaluate the performance of PTM4Tag with five popular pre-trained models: three models trained on general-domain textual data, i.e., BERT, RoBERTa, and ALBERT, and two SE domain-specific models, i.e., CodeBERT and BERTOverflow. Our results show that leveraging the SE-specific PTM CodeBERT in PTM4Tag achieves the best performance among the five considered PTMs. Surprisingly, the other SE-specific PTM, BERTOverflow, performs much worse than BERT, RoBERTa, and CodeBERT. Furthermore, PTM4Tag implemented with CodeBERT outperforms the state-of-the-art approach (based on a Convolutional Neural Network) by a large margin in terms of average Precision@k, Recall@k, and F1-score@k. More specifically, F1-score@5 is boosted by 15.3%.
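For reference, the ranking metrics named in the abstract are commonly computed as below. This is a minimal sketch using the standard definitions; the paper's exact formulation may differ in details (e.g., how posts with fewer than k ground-truth tags are handled).

```python
def precision_at_k(predicted, relevant, k):
    # Fraction of the top-k predicted tags that appear in the ground-truth set.
    top_k = predicted[:k]
    hits = sum(1 for tag in top_k if tag in relevant)
    return hits / k

def recall_at_k(predicted, relevant, k):
    # Fraction of the ground-truth tags recovered within the top-k predictions.
    top_k = predicted[:k]
    hits = sum(1 for tag in top_k if tag in relevant)
    return hits / len(relevant) if relevant else 0.0

def f1_at_k(predicted, relevant, k):
    # Harmonic mean of Precision@k and Recall@k.
    p = precision_at_k(predicted, relevant, k)
    r = recall_at_k(predicted, relevant, k)
    return 2 * p * r / (p + r) if (p + r) else 0.0
```

For example, with predicted tags ["python", "pandas", "dataframe", "numpy", "list"] and ground truth {"python", "pandas", "csv"}, Precision@5 is 2/5 and Recall@5 is 2/3; the reported averages are taken over all test posts.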
Furthermore, we conduct an ablation study to quantify the contribution of a post's constituent components (Title, Description, and Code Snippets) to the performance of PTM4Tag. Our results show that Title is the most important component in predicting the most relevant tags, and that utilizing all three components achieves the best performance.
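The triplet idea described in the abstract can be illustrated roughly as follows: each post component is encoded by its own independent encoder, the three vectors are concatenated into one post representation, and candidate tags are scored against it. This is a hypothetical sketch only: the toy `encode` function stands in for a real pre-trained transformer encoder (e.g., CodeBERT), and `tag_weights` stands in for a learned multi-label classification head; neither reflects the paper's actual implementation.

```python
import math

def encode(text, dim=4):
    # Hypothetical stand-in for a PTM encoder (e.g., a transformer's pooled
    # output): a deterministic toy embedding built from character codes.
    vec = [0.0] * dim
    for i, ch in enumerate(text):
        vec[i % dim] += ord(ch)
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def ptm4tag_score(title, description, code, tag_weights):
    # Encode each component with its own (independent) encoder, then
    # concatenate the three vectors into a single post representation.
    post_vec = encode(title) + encode(description) + encode(code)
    # Score every candidate tag against the post representation; a real
    # system would apply a trained head and return the top-k tags.
    return {tag: sum(w * x for w, x in zip(weights, post_vec))
            for tag, weights in tag_weights.items()}
```

The ablation in the paper corresponds to dropping one of the three `encode` calls and measuring the resulting change in the ranking metrics.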

Sun 15 May

Displayed time zone: Eastern Time (US & Canada)

21:30 - 22:20
Session 1: Summarization (Research) at ICPC room
Chair(s): Haipeng Cai Washington State University, USA
21:30
7m
Talk
PTM4Tag: Sharpening Tag Recommendation of Stack Overflow with Pre-trained Models
Research
Junda He Singapore Management University, Bowen Xu Singapore Management University, Zhou Yang Singapore Management University, DongGyun Han Singapore Management University, Chengran Yang Singapore Management University, David Lo Singapore Management University
Media Attached
21:37
7m
Talk
GypSum: Learning Hybrid Representations for Code Summarization
Research
Yu Wang School of Data Science and Engineering, East China Normal University, Yu Dong School of Data Science and Engineering, East China Normal University, Xuesong Lu School of Data Science and Engineering, East China Normal University, Aoying Zhou East China Normal University
DOI Pre-print Media Attached
21:44
7m
Talk
M2TS: Multi-Scale Multi-Modal Approach Based on Transformer for Source Code Summarization
Research
Yuexiu Gao Shandong Normal University, Chen Lyu Shandong Normal University
Media Attached
21:51
7m
Talk
Semantic Similarity Metrics for Evaluating Source Code Summarization
Research
Sakib Haque University of Notre Dame, Zachary Eberhart University of Notre Dame, Aakash Bansal University of Notre Dame, Collin McMillan University of Notre Dame
Media Attached
21:58
7m
Talk
LAMNER: Code Comment Generation Using Character Language Model and Named Entity Recognition
Research
Rishab Sharma University of British Columbia, Fuxiang Chen University of British Columbia, Fatemeh Hendijani Fard University of British Columbia
Pre-print Media Attached
22:05
15m
Live Q&A
Q&A-Paper Session 1
Research

