
Developers rely on code comments to document their work, track issues, and understand source code. As such, comments provide valuable insight into developers' understanding of their code and describe their intentions in writing the surrounding code. Recent research leverages natural language processing and deep learning to classify comments according to developers' intentions. However, such labelled data are often imbalanced, causing learning models to perform poorly. This work investigates different weighting strategies for the loss function to mitigate the scarcity of certain classes in the dataset. In particular, various RoBERTa-based transformer models are fine-tuned by means of a hyperparameter search to identify their optimal parameter configurations, and are additionally fine-tuned with different weighting strategies for the loss function to address class imbalance. Our approach outperforms the STACC baseline by 8.9 per cent on the NLBSE'25 Tool Competition dataset in terms of the average F1 score, and exceeds the baseline in 17 out of 19 cases, with gains ranging from -5.0 to 38.2. The source code is publicly available at https://github.com/moritzmock/NLBSE2025.
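The abstract does not spell out the weighting schemes used; a common way to weight the loss function under class imbalance is to scale each class's cross-entropy term by its inverse frequency in the training set. The NumPy sketch below is illustrative only: the function names and the normalization are assumptions, not the authors' implementation.

```python
import numpy as np

def class_weights(labels, num_classes):
    # Inverse-frequency weights: rare classes get proportionally larger
    # weights, so their training errors are penalized more heavily.
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    return counts.sum() / (num_classes * np.maximum(counts, 1.0))

def weighted_cross_entropy(logits, labels, weights):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    # Negative log-likelihood of the true class, per sample.
    nll = -np.log(probs[np.arange(len(labels)), labels])
    # Weight each sample by its class weight and average.
    w = weights[labels]
    return float((w * nll).sum() / w.sum())
```

In frameworks such as PyTorch the same idea is expressed by passing a per-class `weight` tensor to the cross-entropy loss; the sketch above just makes the arithmetic explicit.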

Sun 27 Apr

Displayed time zone: Eastern Time (US & Canada)

14:00 - 15:30 | Session 2 - Tool Competition (NLBSE) at 214
Chair(s): Maliheh Izadi (Delft University of Technology), Sebastiano Panichella (University of Bern), Giuseppe Colavito (University of Bari), Pooja Rani (University of Zurich), Ali Al-Kaswan (Delft University of Technology, Netherlands), Nataliia Stulova (MacPaw)
14:00 (10m) | Other
Opening & Code Comment Classification Competition (NLBSE)
Giuseppe Colavito (University of Bari), Pooja Rani (University of Zurich), Ali Al-Kaswan (Delft University of Technology, Netherlands), Nataliia Stulova (MacPaw)
14:10 (10m) | Demonstration
Code Comment Classification with Data Augmentation and Transformer-Based Models (NLBSE)
Mushfiqur Rahman, Mohammed Latif Siddiq (University of Notre Dame)
14:20 (10m) | Demonstration
GRAPHiC: Utilizing Graph Structures and Class Weights in Code Comment Classification with Pretrained BERT Models (NLBSE)
Pir Sami Ullah Shah (FAST National University), Shahela Saif, Muhammad Haris Athar (FAST National University), Muhammad Riyaan Tariq (National University of Computer and Emerging Sciences, Islamabad, Pakistan), Abdur Rehman Afzal (FAST National University)
14:30 (10m) | Demonstration
Evaluating the Performance and Efficiency of Sentence-BERT for Code Comment Classification (NLBSE)
Fabian C. Peña (University of Passau), Steffen Herbold (University of Passau)
14:40 (10m) | Demonstration
Optimizing Deep Learning Models to Address Class Imbalance in Code Comment Classification (NLBSE)
Moritz Mock (Free University of Bozen-Bolzano), Thomas Borsani, Giuseppe Di Fatta, Barbara Russo (Free University of Bozen-Bolzano, Italy)
Pre-print
14:50 (10m) | Demonstration
CodeComClassify: Automating Code Comments Classification using BERT-based Language Models (NLBSE)
Khubaib Amjad Alam, Wajid Ali, Nadeem Abbas (Linnaeus University), Muhammad Haroon (FAST National University), Summan Aziz, Meer Hashaam Khan (FAST National University), Zahoor Ahmad
15:00 (10m) | Other
Competition Closing (NLBSE)
Giuseppe Colavito (University of Bari), Pooja Rani (University of Zurich), Ali Al-Kaswan (Delft University of Technology, Netherlands), Nataliia Stulova (MacPaw)
15:10 (10m) | Day closing
Workshop closing (NLBSE)
Maliheh Izadi (Delft University of Technology), Sebastiano Panichella (University of Bern)