What You See Is Not What You Tap: Detecting Misalignments between Visual and Interactive Boundaries in Mobile Apps
Predictability is a central principle of mobile app interaction design: an app's responses should align with user expectations. Beyond functional misalignment, which is already well tested, our industrial practice highlights a frequently violated yet formally unexplored aspect: \textit{visual-interactive misalignment}, a discrepancy between an element's visual boundary and its interactive boundary. Such misalignments can cause user taps to be ignored or to trigger unintended actions, degrading the overall experience. Because these defects manifest only at runtime, existing static analysis tools are inherently incapable of detecting them. To address this gap, we present TouchAlign, a black-box tool that detects such spatial misalignments. TouchAlign compares an element's visual boundary, extracted via object detection, against its interactive boundary, which it maps out by simulating user taps around the element. Integrated into the software quality assurance pipeline of the Meituan app (an on-demand lifestyle platform with over 700 million users), TouchAlign has uncovered critical defects that had eluded conventional testing, thereby helping preserve the predictability of mobile applications.
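The comparison described above can be illustrated with a minimal sketch. It is not TouchAlign's implementation: the abstract does not specify the probing strategy or the comparison metric, so the grid-based probing, the intersection-over-union (IoU) score, and every name below (`Box`, `iou`, `probe_interactive_box`, `is_misaligned`, the `responds` callback, `margin`, `step`, `threshold`) are assumptions made for illustration. The visual box stands in for the output of an object detector, and `responds(x, y)` stands in for injecting a tap on a device and observing whether the target element reacts.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in screen pixels (hypothetical helper type)."""
    left: int
    top: int
    right: int
    bottom: int

    def area(self) -> int:
        # Degenerate or inverted boxes count as empty.
        return max(0, self.right - self.left) * max(0, self.bottom - self.top)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes; 1.0 means identical boxes."""
    inter = Box(max(a.left, b.left), max(a.top, b.top),
                min(a.right, b.right), min(a.bottom, b.bottom))
    inter_area = inter.area()
    union = a.area() + b.area() - inter_area
    return inter_area / union if union else 0.0


def probe_interactive_box(visual: Box,
                          responds: Callable[[int, int], bool],
                          margin: int = 20,
                          step: int = 4) -> Optional[Box]:
    """Tap a grid around the visual box and bound the taps that got a response.

    `responds` is an assumed callback; on a real device it would inject a tap
    and check whether the target element reacted. The estimate is only as
    precise as the grid spacing `step`.
    """
    hits = [(x, y)
            for x in range(visual.left - margin, visual.right + margin + 1, step)
            for y in range(visual.top - margin, visual.bottom + margin + 1, step)
            if responds(x, y)]
    if not hits:
        return None  # the element never responded: fully non-interactive
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    return Box(min(xs), min(ys), max(xs), max(ys))


def is_misaligned(visual: Box, interactive: Optional[Box],
                  threshold: float = 0.9) -> bool:
    """Flag the element when the two boundaries overlap less than `threshold`.

    The threshold value is an arbitrary illustrative choice.
    """
    return interactive is None or iou(visual, interactive) < threshold
```

For a 100 × 40 px button at `Box(100, 100, 200, 140)` whose tap handler covers only its left 60 px, this sketch estimates the interactive box as `Box(100, 100, 160, 140)`, scores an IoU of 0.6, and flags the element as misaligned. A production tool would also need to distinguish responses from neighboring elements and to choose the probe density against test time, neither of which is modeled here.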