Is Diversity a Meaningful Metric in Fairness Testing?
Abstract—Background: Individual fairness testing aims to identify individual discriminatory instances (IDIs) that can be used to improve the fairness of machine learning classifiers through retraining. Prior studies have evaluated fairness testing algorithms primarily by their efficiency and retraining performance, and additional metrics such as the diversity of the identified IDIs have recently been proposed. These alternative metrics, however, have not yet been established as standard evaluation criteria.
Aims: This study investigates the significance of IDI diversity as a metric for evaluating fairness testing algorithms. Specifically, we assess its utility as a core evaluation metric by examining how it correlates with both the fairness improvement and the accuracy degradation of retrained classifiers.
Method: We conduct an empirical study using a newly developed framework called REDI. The framework generates multiple IDI sets with controlled variations in diversity, enabling a systematic evaluation of how diversity affects retrained classifiers. We apply REDI to analyze the correlations between IDI diversity and retraining outcomes, and we further support the framework's validity through auxiliary empirical analyses.
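To make this analysis pipeline concrete, the sketch below shows one plausible way to quantify the diversity of an IDI set and relate it to a retraining outcome. The mean-pairwise-distance measure and the function names are illustrative assumptions for this sketch, not REDI's actual implementation.

    import numpy as np
    from scipy.spatial.distance import pdist
    from scipy.stats import pearsonr, linregress

    def idi_diversity(idi_set: np.ndarray) -> float:
        # One possible diversity proxy: mean pairwise Euclidean distance
        # among the IDIs in a set (assumed here for illustration only).
        if len(idi_set) < 2:
            return 0.0
        return float(pdist(idi_set).mean())

    def relate_diversity_to_outcome(idi_sets, outcomes):
        # Correlate per-set diversity with a retraining outcome, e.g.
        # fairness improvement or accuracy degradation of the classifier
        # retrained on each IDI set.
        diversities = [idi_diversity(s) for s in idi_sets]
        r, p = pearsonr(diversities, outcomes)                    # correlation strength
        slope, intercept, *_ = linregress(diversities, outcomes)  # effect magnitude
        return r, p, slope

In such a setup, the correlation coefficient captures how strongly diversity is associated with an outcome, while the regression slope indicates the magnitude of its effect.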
Results: Our experiments show that IDI diversity exhibits a moderate correlation with fairness improvement and only a weak correlation with accuracy degradation. In addition, our regression analysis indicates that the practical effect of diversity on fairness improvement is substantial, whereas its effect on accuracy degradation is negligible by comparison.
Conclusions: These results indicate that higher IDI diversity yields substantially greater fairness improvement at minimal accuracy cost, suggesting that diversity should be adopted as a meaningful proxy metric for evaluating fairness testing algorithms, complementing the established metrics.