The MAP metric in Information Retrieval Fault Localization (ASE 2023 - Research Papers)

Who

Thomas Hirsch, Birgit Hofer

Track

ASE 2023 Research Papers

Time Zone

The program is currently displayed in (GMT+02:00) Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna.

Use conference time zone: (GMT+02:00) Amsterdam, Berlin, Bern, Rome, Stockholm, ViennaSelect other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

By setting a time band, the program will dim events that are outside this time window. This is useful for (virtual) conferences with a continuous program (with repeated sessions).
The time band will also limit the events that are included in the personal iCalendar subscription service.

Display full programSpecify a time band

Save

When

Thu 14 Sep 2023 14:06 - 14:18 at Room E - Debugging Chair(s): Carol Hanna

Abstract

The MAP (Mean Average Precision) metric is one of the most popular performance metrics in the field of Information Retrieval Fault Localization (IRFL). However, there are problematic implementations of this MAP metric used in IRFL research. These implementations deviate from the text book definitions of MAP, rendering the metric sensitive to the truncation of retrieval results and inaccuracies and impurities of the used datasets. The application of such a deviating metric can lead to performance overestimation. This can pose a problem for comparability, transferability, and validity of IRFL performance results. In this paper, we discuss the definition and mathematical properties of MAP and common deviations and pitfalls in its implementation. We investigate and discuss the conditions enabling such overestimation: the truncation of retrieval results in combination with ground truths spanning multiple files and improper handling of undefined AP results. We demonstrate the overestimation effects using the Bench4BL benchmark and five well known IRFL techniques. Our results indicate that a flawed implementation of the MAP metric can lead to an overestimation of the IRFL performance, in extreme cases by up to 70 %. We argue for a strict adherence to the text book version of MAP with the extension of undefined AP values to be set to 0 for all IRFL experiments. We hope that this work will help to improve comparability and transferability in IRFL research.

File attachments

The MAP Metric in Information Retrieval Fault Localization (conference_101719.pdf)	293KiB

Thomas Hirsch

Graz University of Technology

Austria

Birgit Hofer