Prioritizing Test Smells: An Empirical Evaluation of Quality Metrics and Developer Perceptions (ICSME 2025 - New Ideas and Emerging Results Track) - ICSME 2025 - International Conference on Software Maintenance and Evolution

Who

Md Arif Hasan, Toukir Ahammed

Track

ICSME 2025 NIER Track

Time Zone

The program is currently displayed in (GMT+12:00) Auckland, Wellington.

When

Thu 11 Sep 2025 14:10 - 14:20 at Case Room 3 260-055 - Session 9 - Testing 3. Chair(s): Sigrid Eldh

Abstract

Test smells, suboptimal patterns in test code, impair software maintainability and reliability, especially in resource-constrained open-source Python projects. While detection tools such as PyNose identify Python-specific test smells, prioritizing them for refactoring remains a challenge due to the lack of test-specific frameworks. This study proposes a metric-driven approach that integrates Change Proneness (CP) and Fault Proneness (FP) metrics, computed via Spearman’s rank correlation, to quantify maintenance and reliability risks across 15 test smells in 52 open-source Python projects. Complementing this, a survey of 45 developers captures subjective severity perceptions. By applying Martin Fowler’s Technical Debt Quadrant, we classify smells into four categories based on empirical risk and developer insights, enabling better prioritization. Of the 15 analyzed smells, Conditional Test Logic, Duplicate Assert, Obscure In-Line Setup, and Redundant Assertion fall into the highest-priority category for refactoring. These smells are characterized by both high empirical risk and strong developer agreement. This integrated framework advances test smell prioritization by combining data-driven analysis with practitioner perspectives, facilitating efficient refactoring decisions and improved test suite quality.
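
The abstract does not spell out how the correlations or the four categories are computed, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each test file has a smell count and a change count, that Spearman's rank correlation between the two serves as the empirical risk signal, and that a smell is placed in one of four categories by comparing that signal and a mean survey rating against cutoffs. The function names, the cutoffs (0.3 and 3.0), the category labels, and the toy numbers are all hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant series counts as "no correlation"
    return cov / (var_x * var_y) ** 0.5


def classify(empirical_risk, developer_severity, risk_cut=0.3, severity_cut=3.0):
    """Place a smell in one of four categories (labels and cutoffs are hypothetical)."""
    high_risk = empirical_risk >= risk_cut
    high_severity = developer_severity >= severity_cut
    if high_risk and high_severity:
        return "high risk, high perceived severity"
    if high_risk:
        return "high risk, low perceived severity"
    if high_severity:
        return "low risk, high perceived severity"
    return "low risk, low perceived severity"


# Toy data for illustration only: per-file smell counts and change counts.
smell_counts = [0, 1, 2, 3, 5]
change_counts = [1, 2, 4, 6, 9]
rho = spearman(smell_counts, change_counts)
category = classify(rho, developer_severity=4.1)
```

A pure-Python rank correlation is used here to keep the sketch dependency-free; in practice a statistics library would normally be used instead, and the study's actual thresholds and category definitions may differ.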

Md Arif Hasan

University of Dhaka, Bangladesh

Bangladesh

Toukir Ahammed

Institute of Information Technology, University of Dhaka