Practical considerations and solutions in NLP-based analysis of code review comments - An experience report
Context: Automated analysis of code review comments (CRCs) can help surface issues that reviewers frequently discuss across large repositories. However, CRCs mix natural language text with code references, so topic modeling approaches must be selected carefully.
Objective: This work discusses the challenges observed while evaluating two topic modeling methods for the analysis of CRCs.
Method: We evaluated GSDMM and BERTopic for identifying frequently discussed themes in 5,560 CRCs, followed by a domain expert's evaluation of the quality of the generated themes.
Results: We report several observations and challenges in improving the quality of the generated themes, including choices of pre-processing, topic modeling parameters, embedding model, and objective measures, all of which affect the interpretability of the generated topics.
Conclusions: This work raises important questions about how CRCs should be analyzed and offers potential avenues and suggestions for further exploration. Future studies can use the technical demonstrator to explore the interpretability of the topics generated from CRCs.
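For readers unfamiliar with GSDMM (the Gibbs Sampling Dirichlet Multinomial Mixture model for short texts), the core procedure can be sketched in pure Python. This is a minimal illustrative sampler with toy parameters, not the authors' implementation or any particular library; each document is assigned to exactly one cluster, and cluster assignments are resampled in proportion to how well the cluster's word counts explain the document.

```python
import math
import random
from collections import defaultdict

def gsdmm(docs, K=8, alpha=0.1, beta=0.1, iters=15, seed=0):
    """Cluster tokenized short texts (e.g. code review comments) with GSDMM.

    docs  : list of token lists
    K     : upper bound on the number of clusters (GSDMM tends to empty some)
    alpha : prior weight toward popular clusters
    beta  : prior weight toward clusters sharing the document's words
    Returns (assignments, per-cluster word counts).
    """
    rng = random.Random(seed)
    V = len({w for d in docs for w in d})          # vocabulary size
    m = [0] * K                                    # documents per cluster
    n = [0] * K                                    # total words per cluster
    nw = [defaultdict(int) for _ in range(K)]      # word counts per cluster

    # Random initial assignment of every document to a cluster.
    z = [rng.randrange(K) for _ in docs]
    for d, k in zip(docs, z):
        m[k] += 1
        n[k] += len(d)
        for w in d:
            nw[k][w] += 1

    for _ in range(iters):
        for i, d in enumerate(docs):
            # Remove document i from its current cluster's statistics.
            k = z[i]
            m[k] -= 1
            n[k] -= len(d)
            for w in d:
                nw[k][w] -= 1

            # Log-probability of each cluster generating document i.
            logp = []
            for c in range(K):
                lp = math.log(m[c] + alpha)
                seen = defaultdict(int)
                for w in d:
                    lp += math.log(nw[c][w] + beta + seen[w])
                    seen[w] += 1
                for t in range(len(d)):
                    lp -= math.log(n[c] + V * beta + t)
                logp.append(lp)

            # Sample a new cluster from the normalized probabilities.
            mx = max(logp)
            ps = [math.exp(l - mx) for l in logp]
            r = rng.random() * sum(ps)
            for c, p in enumerate(ps):
                r -= p
                if r <= 0:
                    break

            # Add document i to the sampled cluster's statistics.
            z[i] = c
            m[c] += 1
            n[c] += len(d)
            for w in d:
                nw[c][w] += 1
    return z, nw

# Toy usage on hypothetical pre-tokenized review comments.
docs = [["rename", "variable", "style"], ["style", "variable", "naming"],
        ["null", "check", "missing"], ["missing", "test", "null"]]
assignments, word_counts = gsdmm(docs, K=4, iters=20, seed=1)
```

The choices the paper highlights (pre-processing, number of topics, priors) map directly onto `docs`, `K`, `alpha`, and `beta` here; BERTopic replaces this count-based sampler with embedding-based clustering and so introduces the additional embedding-model choice.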