Combining Coverage and Expert Features with Semantic Representation for Coincidental Correctness Detection
Coincidental correctness (CC) can mislead developers because it gives the impression that code is functioning correctly when it actually contains hidden faults. To mitigate the negative impact of CC test cases, extensive research has addressed their detection using either coverage-based or expert-based features, with promising results. Coverage and expert features each provide unique insights into program execution, yet the literature has not fully explored the potential of combining the two feature sets to improve CC detection. In addition, the rich semantics of the test code and the focal method have not been fully utilized. We therefore propose CORE, a unified model that integrates coverage and expert features with semantic representations of test and focal methods to improve the detection of CC test cases. We conduct a comprehensive evaluation against six state-of-the-art baselines on the widely used Defects4J benchmark. The experimental results show that CORE substantially outperforms the baselines in CC detection accuracy (a 40% improvement in F1 score on average). An ablation experiment shows that the coverage features, expert features, and semantic representations each contribute to CORE. CORE also improves the effectiveness of spectrum-based and mutation-based fault localization (e.g., a 50% improvement for the spectrum-based formula Dstar and a 44% improvement for the mutation-based method MUSE under the relabeling strategy).
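To illustrate why detected CC test cases matter for spectrum-based fault localization, the following minimal sketch computes Dstar suspiciousness scores before and after relabeling. It assumes the commonly used Dstar definition, ef^* / (ep + nf) with * = 2, and it assumes that the relabeling strategy treats predicted CC tests as failing. The function names, the toy coverage data, and the hard-coded CC prediction are hypothetical; this is not CORE's implementation.

```python
from typing import Dict, Set


def dstar(coverage: Dict[str, Set[str]], failed: Dict[str, bool], star: int = 2) -> Dict[str, float]:
    """Dstar suspiciousness per statement: ef**star / (ep + nf).

    coverage maps each test to the statements it executes;
    failed maps each test to True if it is labeled as failing.
    """
    total_failed = sum(1 for is_failed in failed.values() if is_failed)
    statements = set().union(*coverage.values())
    scores = {}
    for stmt in statements:
        ef = sum(1 for t, cov in coverage.items() if stmt in cov and failed[t])
        ep = sum(1 for t, cov in coverage.items() if stmt in cov and not failed[t])
        nf = total_failed - ef  # failing tests that do not execute stmt
        denom = ep + nf
        if denom == 0:
            # Executed only by failing tests, and by all of them.
            scores[stmt] = float("inf") if ef > 0 else 0.0
        else:
            scores[stmt] = ef ** star / denom
    return scores


def relabel(failed: Dict[str, bool], predicted_cc: Set[str]) -> Dict[str, bool]:
    """Assumed relabeling strategy: mark predicted CC tests as failing."""
    return {t: is_failed or t in predicted_cc for t, is_failed in failed.items()}


# Toy example (hypothetical): s2 is the faulty statement, t2 executes it but passes.
coverage = {
    "t1": {"s1", "s2"},
    "t2": {"s1", "s2"},
    "t3": {"s1", "s3"},
    "t4": {"s1", "s3"},
}
failed = {"t1": True, "t2": False, "t3": False, "t4": False}

before = dstar(coverage, failed)
after = dstar(coverage, relabel(failed, predicted_cc={"t2"}))
```

In this toy case the passing run of `t2` dilutes the score of the faulty statement `s2`; once `t2` is relabeled, `s2` is executed only by failing tests and its score rises relative to `s1` and `s3`.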
Wed 30 Oct, 10:30 - 12:00. Displayed time zone: Pacific Time (US & Canada)

| Time | Duration | Type | Title | Track | Authors |
| --- | --- | --- | --- | --- | --- |
| 10:30 | 15m | Talk | B4: Towards Optimal Assessment of Plausible Code Solutions with Plausible Tests | Research Papers | Mouxiang Chen (Zhejiang University), Zhongxin Liu (Zhejiang University), He Tao (Zhejiang University), Yusu Hong (Zhejiang University), David Lo (Singapore Management University), Xin Xia (Huawei), JianLing Sun (Zhejiang University) |
| 10:45 | 15m | Talk | Reducing Test Runtime by Transforming Test Fixtures | Research Papers | Chengpeng Li (University of Texas at Austin), Abdelrahman Baz (The University of Texas at Austin), August Shi (The University of Texas at Austin) |
| 11:00 | 15m | Talk | Efficient Incremental Code Coverage Analysis for Regression Test Suites | Research Papers | |
| 11:15 | 15m | Talk | Combining Coverage and Expert Features with Semantic Representation for Coincidental Correctness Detection | Research Papers | Huan Xie (Chongqing University), Yan Lei (Chongqing University), Maojin Li (Chongqing University), Meng Yan (Chongqing University), Sheng Zhang (Chongqing University) |
| 11:30 | 15m | Talk | A Combinatorial Testing Approach to Surrogate Model Construction | Research Papers | Sunny Shree (The University of Texas at Arlington), Krishna Khadka (The University of Texas at Arlington), Jeff Yu Lei (University of Texas at Arlington), Raghu Kacker (National Institute of Standards and Technology), D. Richard Kuhn (National Institute of Standards and Technology) |
| 11:45 | 15m | Talk | The Importance of Accounting for Execution Failures when Predicting Test Flakiness | Industry Showcase | Guillaume Haben (University of Luxembourg), Sarra Habchi (Ubisoft Montréal), John Micco (VMware), Mark Harman (Meta Platforms, Inc. and UCL), Mike Papadakis (University of Luxembourg), Maxime Cordy (University of Luxembourg, Luxembourg), Yves Le Traon (University of Luxembourg, Luxembourg) |

