Towards Enhancing Task Prioritization in Software Development Through Transformer-Based Issues Classification
Prioritizing tasks is extremely beneficial, but also difficult for software development teams. Assigning priority to tasks is also time-consuming, especially in projects with a high volume of new issues. Consequently, many issues on GitHub are not labeled. An effective priority tool can streamline this process by suggesting priority labels, saving developers' time and enabling faster identification of high-impact product improvements. In this paper, we investigate the application of text classification using Transformer models to automatically assign priority labels to software development issues. We used data from vast GitHub and Jira datasets to develop state-of-the-art machine learning models (Transformers) that automatically classify the priority of issues based on their text. We thoroughly evaluated the generalizability of our models using issues self-tagged by developers in projects that were not part of the training data (out-of-distribution), and we adapted our models to specific projects by incorporating part of their issues into the training (fine-tuning) to improve performance. Our experiments show that results vary across projects but can reach up to 80% of high-priority issues correctly labeled in a project. Our results indicate that Transformers have the potential to assist developers in (semi-)automatically assigning priority labels to their issues, thereby reducing overhead. We find that fine-tuning significantly improves performance by adapting the machine learning models to specific projects, but further research is needed to optimize this approach.
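To illustrate the kind of approach the abstract describes, the sketch below fine-tunes a pretrained Transformer as a sequence classifier over issue text using the Hugging Face libraries. The model name (`roberta-base`), the binary low/high label scheme, the example issues, and all training hyperparameters are illustrative assumptions, not the paper's actual configuration.

```python
# Minimal sketch: fine-tuning a Transformer to classify issue priority.
# Model, labels, data, and hyperparameters are assumptions for illustration.
from transformers import (AutoTokenizer,
                          AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)
from datasets import Dataset

# Hypothetical issue texts with priority labels (0 = low, 1 = high).
issues = Dataset.from_dict({
    "text": ["App crashes on startup when opening a project",
             "Typo in the settings dialog tooltip"],
    "label": [1, 0],
})

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=2)

def tokenize(batch):
    # Convert raw issue text into token IDs the Transformer expects.
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

issues = issues.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="priority-classifier",
                           num_train_epochs=3,
                           per_device_train_batch_size=8),
    train_dataset=issues,
)
trainer.train()  # adapts the pretrained weights to the priority-labeling task
```

In this setup, project-specific fine-tuning, as discussed in the abstract, would amount to continuing training on a held-out portion of the target project's self-tagged issues before evaluating on the remainder.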