Navigating the landscape of AI test methods using taxonomy-based selection
Due to the broad deployment of AI systems in risk-prone domains and AI regulations coming into effect, systematic risk and quality assessments of AI systems have become increasingly important. Conducting such assessments involves identifying relevant quality criteria for a given AI system and selecting test methods, i.e., procedures for collecting and evaluating evidence and measurable quantities, that fit the identified criteria. This selection process can be challenging due to the high complexity of the test method landscape and, in the context of independent audits, due to potential conflicts of interest between the involved stakeholders. To address this challenge, we extend existing frameworks for AI assessments with a systematic, taxonomy-based method for selecting, in independent audits, test methods suited to the given AI system and its application context. We evaluate our taxonomy on a subset of the OECD metrics catalogue and demonstrate its applicability with two use cases.
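To give a rough intuition for the kind of taxonomy-based filtering described above, the following minimal Python sketch matches test methods against taxonomy dimensions such as quality criterion, model access, and data modality. All names, dimensions, and catalogue entries here are illustrative assumptions, not the paper's actual taxonomy or the OECD catalogue.

```python
from dataclasses import dataclass

# Hypothetical taxonomy entry for a test method; the dimensions
# (quality_criteria, required_access, data_modalities) are assumed
# for illustration and do not reproduce the paper's taxonomy.
@dataclass(frozen=True)
class TestMethod:
    name: str
    quality_criteria: frozenset   # e.g. {"robustness", "fairness"}
    required_access: str          # e.g. "black-box" or "white-box"
    data_modalities: frozenset    # e.g. {"tabular", "image"}

def select_methods(catalogue, criterion, access, modality):
    """Return catalogue entries matching all queried taxonomy dimensions."""
    return [
        m for m in catalogue
        if criterion in m.quality_criteria
        and m.required_access == access
        and modality in m.data_modalities
    ]

# Toy catalogue with two invented test methods.
catalogue = [
    TestMethod("adversarial_perturbation_test",
               frozenset({"robustness"}), "black-box", frozenset({"image"})),
    TestMethod("demographic_parity_check",
               frozenset({"fairness"}), "black-box", frozenset({"tabular"})),
]

# Auditor's query: robustness tests runnable with black-box access on images.
print(select_methods(catalogue, "robustness", "black-box", "image"))
```

In practice, selection along such dimensions would be combined with the application context and the audit setting, but the sketch conveys the core idea: a structured catalogue plus explicit taxonomy dimensions turns test-method selection into a reproducible query rather than an ad hoc choice.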