The MAP (Mean Average Precision) metric is one of the most popular performance metrics in the field of Information Retrieval Fault Localization (IRFL). However, problematic implementations of the MAP metric are in use in IRFL research. These implementations deviate from the textbook definition of MAP, rendering the metric sensitive to the truncation of retrieval results and to inaccuracies and impurities in the datasets used. Applying such a deviating metric can lead to performance overestimation, which poses a problem for the comparability, transferability, and validity of IRFL performance results. In this paper, we discuss the definition and mathematical properties of MAP as well as common deviations and pitfalls in its implementation. We investigate and discuss the conditions enabling such overestimation: the truncation of retrieval results in combination with ground truths spanning multiple files, and improper handling of undefined AP results. We demonstrate the overestimation effects using the Bench4BL benchmark and five well-known IRFL techniques. Our results indicate that a flawed implementation of the MAP metric can lead to an overestimation of IRFL performance, in extreme cases by up to 70%. We argue for strict adherence to the textbook version of MAP, extended so that undefined AP values are set to 0, for all IRFL experiments. We hope that this work will help to improve comparability and transferability in IRFL research.
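The two conditions named in the abstract can be illustrated with a small sketch. It is not the paper's code: the function names, the `normalize_by_retrieved` and `undefined_as_zero` flags, and the file names are hypothetical, and the "deviating" variant is only one plausible reading of the deviation described (dividing by the number of relevant files found in the truncated list instead of by the size of the ground truth, and skipping undefined AP values when averaging).

```python
from typing import List, Optional, Sequence, Set


def average_precision(
    ranking: Sequence[str],
    relevant: Set[str],
    normalize_by_retrieved: bool = False,
) -> Optional[float]:
    """AP of one (possibly truncated) ranking; None if the value is undefined."""
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    # Textbook: divide by the size of the ground truth.
    # Assumed deviation: divide by the relevant items actually retrieved.
    denominator = hits if normalize_by_retrieved else len(relevant)
    if denominator == 0:
        return None  # undefined AP
    return precision_sum / denominator


def mean_average_precision(
    ap_values: Sequence[Optional[float]],
    undefined_as_zero: bool = True,
) -> float:
    """Mean of AP values; undefined values count as 0 or are skipped."""
    if undefined_as_zero:
        values: List[float] = [0.0 if ap is None else ap for ap in ap_values]
    else:
        values = [ap for ap in ap_values if ap is not None]
    return sum(values) / len(values) if values else 0.0


# Two hypothetical bug reports, each with a ranking truncated to the top 3.
top3_a = ["A.java", "B.java", "C.java"]
truth_a = {"A.java", "X.java"}  # X.java lies beyond the truncation point
top3_b = ["D.java", "E.java", "F.java"]
truth_b = {"Y.java"}            # nothing relevant in the top 3

textbook_map = mean_average_precision(
    [average_precision(top3_a, truth_a), average_precision(top3_b, truth_b)]
)
deviating_map = mean_average_precision(
    [
        average_precision(top3_a, truth_a, normalize_by_retrieved=True),
        average_precision(top3_b, truth_b, normalize_by_retrieved=True),
    ],
    undefined_as_zero=False,
)
print(textbook_map, deviating_map)
```

In this toy setup the textbook variant penalizes both the missed file and the query with no hit, while the assumed deviating variant ignores both, which is the kind of gap the abstract describes as overestimation.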
The MAP Metric in Information Retrieval Fault Localization (conference_101719.pdf) | 293KiB |
Thu 14 Sep | Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna
13:30 - 15:00 | Debugging | Research Papers / Industry Showcase (Papers) at Room E | Chair(s): Carol Hanna, University College London | ||
13:30 12mTalk | Coding and Debugging by Separating Secret Code toward Secure Remote Development Industry Showcase (Papers) Shinobu Saito NTT Media Attached File Attached | ||
13:42 12mTalk | Detecting Memory Errors in Python Native Code by Tracking Object Lifecycle with Reference Count Research Papers Xutong Ma State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, China, Jiwei Yan Institute of Software at Chinese Academy of Sciences, China, Hao Zhang Institute of Software, Chinese Academy of Sciences, Jun Yan Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Jian Zhang Institute of Software at Chinese Academy of Sciences; University of Chinese Academy of Sciences Pre-print | ||
13:54 12mResearch paper | PERFCE: Performance Debugging on Databases with Chaos Engineering-Enhanced Causality Analysis Research Papers Zhenlan Ji The Hong Kong University of Science and Technology, Pingchuan Ma HKUST, Shuai Wang Hong Kong University of Science and Technology Pre-print | ||
14:06 12mTalk | The MAP metric in Information Retrieval Fault Localization Research Papers Media Attached File Attached | ||
14:18 12mTalk | Eiffel: Inferring Input Ranges of Significant Floating-point Errors via Polynomial Extrapolation (Recorded talk) Research Papers Zuoyan Zhang Information Engineering University, Bei Zhou Information Engineering University, Jiangwei Hao Information Engineering University, Hongru Yang Information Engineering University, Mengqi Cui Information Engineering University, Yuchang Zhou Information Engineering University, Guanghui Song Information Engineering University, Fei Li Information Engineering University, Jinchen Xu Information Engineering University, Jie Zhao State Key Laboratory of Mathematical Engineering and Advanced Computing Media Attached File Attached | ||
14:30 12mTalk | Information Retrieval-based Fault Localization for Concurrent Programs (Recorded talk) Research Papers Pre-print Media Attached |