B-AIS: An Automated Process for Black-box Evaluation of AI-enabled Software Systems against Domain Semantics (ASE 2022 - Research Papers)

Who

Hamed Barzamini, Mona Rahimi

Track

ASE 2022 Research Papers

Time Zone

The program is currently displayed in (GMT-04:00) Eastern Time (US & Canada).

When

Tue 11 Oct 2022 10:30 - 10:50 at Ballroom C East - Technical Session 1 - AI for SE I Chair(s): Andrea Stocco

Abstract

AI-enabled software systems (AIS) are prevalent in a wide range of applications, such as the visual tasks of autonomous systems, which are extensively deployed in the automotive, aerial, and naval domains. Hence, it is crucial for humans to evaluate a model’s intelligence before an AIS is deployed to safety-critical environments, such as public roads.

In this paper, we assess AIS visual intelligence by measuring how completely the model perceives the primary concepts of a domain and the variants of those concepts. For instance, is the visual perception of an autonomous detector mature enough to recognize instances of "pedestrian" (a concept in the automotive domain) in Halloween costumes? An AIS will be more reliable once the model’s ability to perceive a concept is expressed in a human-understandable language. For instance, is a pedestrian in a "wheelchair" mistakenly recognized as a pedestrian on a "bike" because the domain concepts bike and wheelchair are both associated with a shared feature, "wheel"?

We answer questions of this type by implementing a generic process within a framework, called B-AIS, which systematically evaluates AIS perception against the semantic specifications of a domain while treating the model as a black box. Semantics is the meaning and understanding of words in a language and is therefore more comprehensible to humans than the pixel-level visual information of an AIS. B-AIS processes these heterogeneous artifacts so that they become comparable, and leverages the results of the comparison to reveal AIS weaknesses in a human-understandable language. Evaluations of B-AIS on the vision task of pedestrian detection showed that B-AIS identified the missing variants of the pedestrian with $F_{2}$ measures of 95% in the dataset and 85% in the model.
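
For readers unfamiliar with the $F_{2}$ measure reported above, the following is a minimal sketch of the standard F-beta score with beta = 2, which weights recall more heavily than precision. The function name `f_beta` is hypothetical, and the abstract does not say how the authors computed precision and recall, so this illustrates only the general formula, not the paper's evaluation procedure.

```python
def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """Standard F-beta score: a weighted harmonic mean of precision and recall.

    With beta = 2 (the F2 measure), recall counts more than precision.
    Hypothetical helper for illustration; not taken from the B-AIS paper.
    """
    if not (0.0 <= precision <= 1.0 and 0.0 <= recall <= 1.0):
        raise ValueError("precision and recall must lie in [0, 1]")
    b2 = beta * beta
    denominator = b2 * precision + recall
    if denominator == 0.0:
        # Both precision and recall are zero; define the score as zero.
        return 0.0
    return (1.0 + b2) * precision * recall / denominator


if __name__ == "__main__":
    # Illustrative inputs only; these are not results from the paper.
    print(f_beta(precision=0.5, recall=1.0))  # high recall, low precision
    print(f_beta(precision=1.0, recall=0.5))  # high precision, low recall
```

Because recall is weighted more heavily, the first call scores higher than the second, which suits a setting where failing to flag a missing concept variant is costlier than raising a false alarm.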

Hamed Barzamini

Mona Rahimi

Northern Illinois University

United States