Uncovering Discrimination Clusters: Quantifying and Explaining Systematic Fairness Violations
Fairness in algorithmic decision-making is often framed in terms of individual fairness, which requires that similar individuals receive similar outcomes. A system violates individual fairness if there exists a pair of inputs, differing only in protected attributes (such as race or gender), that leads to significantly different outcomes (for example, one favorable and the other unfavorable). While this notion highlights isolated instances of unfairness, it fails to capture broader patterns of systematic or \emph{clustered discrimination} that may affect entire subgroups.
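Concretely, writing $f$ for the model, $P$ for the set of protected attribute indices, and $\delta$ for a significance threshold (notation introduced here for illustration only), such a violation is witnessed by a pair
\[
x, x' \quad \text{with} \quad x_i = x'_i \;\; \forall i \notin P
\quad \text{and} \quad |f(x) - f(x')| > \delta .
\]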
We introduce and motivate the concept of \emph{discrimination clustering}, a generalization of individual fairness violations. Rather than detecting single counterfactual disparities, we seek to uncover regions of the input space where small perturbations in protected features lead to \emph{$k$ significantly distinct clusters} of outcomes. That is, for a given input, we identify a local neighborhood, differing only in protected attributes, whose members' outputs separate into many distinct clusters. These clusters reveal significant arbitrariness in treatment based solely on protected attributes, exposing patterns of algorithmic bias that elude pairwise fairness checks.
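Under the same illustrative notation, one way to make this precise: let $N(x) = \{\, x' : x'_i = x_i \text{ for all } i \notin P \,\}$ be the protected-attribute neighborhood of an input $x$. Then $x$ is \emph{$k$-unfair} if the outcome set $\{ f(x') : x' \in N(x) \}$ admits a partition into clusters $C_1, \dots, C_k$ that are pairwise separated by more than a threshold $\varepsilon$:
\[
\min_{j \neq l} \;\; \min_{a \in C_j,\, b \in C_l} |a - b| > \varepsilon .
\]
The exact separation criterion above is our own simplification, chosen only to convey the idea.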
We present HyFair, a hybrid technique that combines formal symbolic analysis (via SMT and MILP solvers), which certifies individual fairness, with randomized search, which discovers discriminatory clusters. This combination enables both formal guarantees, when no counterexamples exist, and the detection of severe violations that are computationally challenging for symbolic methods alone. Given a set of inputs exhibiting high $k$-unfairness, we introduce a novel explanation method that generates interpretable, decision-tree-style artifacts. Our experiments demonstrate that HyFair outperforms state-of-the-art fairness verification and local explanation methods. In particular, HyFair reveals that some benchmarks exhibit significant discrimination clustering, while others show limited or no disparities with respect to protected attributes. It also provides intuitive explanations that support the understanding and mitigation of unfairness.
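As a rough illustration of the certify-then-search idea (not HyFair's implementation; the toy linear model, the z3 encoding, and all names below are our assumptions), the sketch first asks an SMT solver whether any counterexample to individual fairness exists, and falls back to randomized neighborhood sampling only when one does:

\begin{verbatim}
# Illustrative sketch of a hybrid certify-then-search loop (assumed names;
# not the HyFair implementation). Model: a toy linear classifier whose
# last feature is a binary protected attribute.
from z3 import Real, Solver, sat
import random

W = [0.8, -0.5, 2.0]  # weight on the protected attribute is W[2]
B = 0.1

def decide(x):
    """Toy classifier: favorable iff the linear score is positive."""
    return sum(w * xi for w, xi in zip(W, x)) + B > 0

# --- Symbolic phase: can flipping the protected bit flip the decision? ---
x0, x1 = Real('x0'), Real('x1')
score_p1 = W[0] * x0 + W[1] * x1 + W[2] * 1 + B  # protected attribute = 1
score_p0 = W[0] * x0 + W[1] * x1 + W[2] * 0 + B  # protected attribute = 0
s = Solver()
s.add(0 <= x0, x0 <= 1, 0 <= x1, x1 <= 1)        # input domain
s.add(score_p1 > 0, score_p0 <= 0)               # decisions disagree
if s.check() != sat:
    print("individual fairness certified for this toy model")
else:
    print("counterexample found:", s.model())
    # --- Randomized phase: sample neighborhoods to gauge how widespread
    # the disparate treatment is. ---
    disparate = 0
    for _ in range(1000):
        base = [random.random(), random.random()]
        outcomes = {decide(base + [a]) for a in (0, 1)}
        if len(outcomes) > 1:  # protected value alone flipped the decision
            disparate += 1
    print(f"disparate neighborhoods: {disparate}/1000")
\end{verbatim}

In this toy setting the solver's answer is a formal guarantee when unsatisfiable, while the sampling loop estimates the extent of the violation; HyFair additionally clusters the sampled outcomes and distills explanations from them.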