RepresentThemAll: A Universal Learning Representation of Bug Reports
Deep learning techniques have shown promising performance on automated software maintenance tasks associated with bug reports. However, all existing studies learn a customized representation of bug reports for one specific downstream task. Despite early success, training multiple models for multiple downstream tasks faces three issues, namely complexity, cost, and compatibility, owing to the customization, disparity, and uniqueness of these automated approaches. To resolve these challenges, we propose RepresentThemAll, a pre-trained approach that learns a universal representation of bug reports and can handle multiple downstream tasks. Specifically, RepresentThemAll is a universal bug report framework pre-trained with two carefully designed learning objectives: a dynamic masked language model and a contrastive learning objective, “find yourself”. We evaluate the performance of RepresentThemAll on four downstream tasks: duplicate bug report detection, bug report summarization, bug priority prediction, and bug severity prediction. Our experimental results show that, after well-designed fine-tuning, RepresentThemAll outperforms all baseline approaches on all four downstream tasks.
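The two pre-training objectives named above can be loosely sketched as follows. This is an illustrative toy implementation, not the paper's exact design: the 15% masking rate, the InfoNCE-style contrastive formulation, and all function names are assumptions for exposition. Dynamic masking re-samples mask positions on every pass over the data (in contrast to static masking, which fixes them once), and the contrastive objective pulls together embeddings of two views of the same bug report while pushing apart embeddings of different reports in the batch.

```python
import numpy as np

rng = np.random.default_rng(0)
MASK_ID = 0  # hypothetical id of the [MASK] token

def dynamic_mask(token_ids, mask_prob=0.15):
    """Re-sample mask positions on every call (i.e., every epoch),
    unlike static masking where positions are fixed at preprocessing time."""
    ids = np.asarray(token_ids)
    positions = rng.random(len(ids)) < mask_prob
    masked = ids.copy()
    masked[positions] = MASK_ID
    return masked, positions  # the model is trained to predict ids[positions]

def info_nce(anchors, positives, temperature=0.07):
    """Toy InfoNCE-style contrastive loss: row i of `positives` is an
    embedding of another view of bug report i; all other rows in the
    batch act as negatives."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                 # (B, B) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return -np.log(np.diagonal(probs)).mean()

# Usage: the mask pattern changes between "epochs", and matched views
# of the same report yield a lower contrastive loss than mismatched ones.
tokens = list(range(1, 21))
m1, pos1 = dynamic_mask(tokens)
m2, pos2 = dynamic_mask(tokens)       # generally a different mask pattern
emb = rng.normal(size=(4, 8))
loss_matched = info_nce(emb, emb)               # identical views: minimal loss
loss_mismatched = info_nce(emb, emb[::-1].copy())  # wrong pairing: higher loss
```

In this sketch, pairing each anchor with its own view places the largest similarity on the diagonal, so the matched loss is never larger than the mismatched one; the actual framework would compute both objectives over transformer encodings of bug report text rather than random vectors.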