RovoDev Code Reviewer: A Large-Scale Online Evaluation of LLM-based Code Review Automation at Atlassian
Large Language Model (LLM)-powered code review automation has the potential to transform code review workflows. Despite advances in LLM-powered code review comment generation approaches, several practical challenges remain in designing enterprise-grade code review automation tools. In particular, this paper aims to answer the practical question: how can we design a review-guided, context-aware, quality-checked code review comment generation approach without fine-tuning? In this paper, we present RovoDev Code Reviewer, an enterprise-grade LLM-based code review automation tool designed and deployed at scale within Atlassian’s development ecosystem, with seamless integration into Atlassian’s Bitbucket. Through offline, online, and user feedback evaluations over a one-year period, we conclude that RovoDev Code Reviewer (1) performs reasonably well in review localization, though its generated comments may differ from those written by humans; and (2) offers the promise of accelerating feedback cycles (i.e., decreasing PR cycle time), alleviating reviewer workload (i.e., reducing the number of human-written comments), and improving overall software quality (i.e., finding errors with actionable suggestions).