Building Bridges, Not Walls: Fairness-aware and Accurate Recommendation of Code Reviewers via LLM-based Agents Collaboration
Code review is essential for the maintenance of pull request-based software systems. Recommending suitable reviewers for code changes can facilitate defect detection and knowledge dissemination. Despite extensive research, the inherent complexity of pull requests (PRs) and reviewer profiles continues to pose challenges for accurately matching them. Furthermore, existing methods often amplify gender and racial/ethnic disparities because they overlook biases present in historical review records. To address these issues, we first collected a dataset from 4 large-scale open-source projects, spanning a 50-month revision history and covering up to 30 attributes. This dataset includes gender and racial/ethnic information, which was inferred, validated, and incorporated to enable comprehensive analysis of data bias in reviewer recommendation tasks. Additionally, we introduce a fairness-aware and accurate approach, CoReBM, which leverages the advanced semantic understanding capabilities of Large Language Models (LLMs) to comprehensively capture the nuanced textual context of both PRs and reviewers, while utilizing the robust planning, collaboration, and decision-making abilities of multi-agent systems. CoReBM integrates diverse factors to improve recommendation performance while mitigating bias through the incorporation of candidates' gender and racial/ethnic attributes. We evaluate the effectiveness of our approach on this dataset, and the results demonstrate that CoReBM outperforms state-of-the-art methods in both recommendation accuracy and fairness.