When Does Wasm Malware Detection Fail? A Systematic Analysis of Their Robustness to Evasion
WebAssembly (Wasm) provides a language-agnostic compilation target that delivers near-native performance for web applications, yet it also attracts adversaries, such as cryptojackers, who exploit Wasm to steal victims' computing resources. While several detection tools have been proposed, their robustness against perturbations remains largely unknown. In this paper, we introduce SWAMPED (Systematic WebAssembly Module Perturbation Evaluation of Detectors), a framework that incorporates 22 semantics-preserving perturbation methods. SWAMPED generates a total of 48,840 perturbed variants from 43 cryptojacker samples and 31 additional real-world Wasm malware binaries. We assess the detection performance of six detectors: three Wasm-specific detectors and three deep neural network (DNN) detectors. We find that DNN-based detectors are vulnerable to perturbations that shift the instruction distribution; profiling-based methods are disrupted by changes in instruction frequency; and semantic-aware approaches are highly sensitive to function-level dependency modifications. DNN-based detectors, which lack Wasm-specific modeling, are particularly susceptible to changes in the spatial layout of Wasm binaries. These findings highlight fundamental limitations in current Wasm malware detection approaches, which rely on overly specific detection heuristics and inadequately trained or designed models. We offer suggestions for improving robustness against perturbations.
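To make the instruction-frequency finding concrete, the following toy sketch shows how a semantically neutral perturbation can shift an opcode-frequency profile. It is an illustration only and not one of SWAMPED's 22 perturbation methods: the helper names (`instruction_profile`, `insert_nops`, `l1_distance`) and the sample instruction sequence are hypothetical, and a real module would be rewritten at the binary level rather than as a list of mnemonics. It relies only on the fact that the Wasm `nop` instruction has no effect on execution.

```python
from collections import Counter


def instruction_profile(instrs):
    """Return the relative frequency of each opcode mnemonic."""
    counts = Counter(instrs)
    total = len(instrs)
    return {op: n / total for op, n in counts.items()}


def insert_nops(instrs, every=2):
    """Hypothetical perturbation: append a `nop` after every `every` instructions.

    `nop` does nothing at run time, so the computation is unchanged.
    """
    out = []
    for i, op in enumerate(instrs, start=1):
        out.append(op)
        if i % every == 0:
            out.append("nop")
    return out


def l1_distance(p, q):
    """Sum of absolute differences between two frequency profiles."""
    ops = set(p) | set(q)
    return sum(abs(p.get(op, 0.0) - q.get(op, 0.0)) for op in ops)


# Assumed toy sequence resembling the bit-heavy inner loop of a hash function.
original = ["i32.xor", "i32.shl", "i32.xor", "i32.rotl", "i32.add", "i32.xor"]
perturbed = insert_nops(original, every=2)

shift = l1_distance(instruction_profile(original), instruction_profile(perturbed))
print(f"profile shift (L1): {shift:.3f}")
```

In this toy case the share of `i32.xor` drops from one half to one third even though the computation is identical, which is the kind of shift that could push a frequency-based profile across a fixed detection threshold.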