W2GPU: Toward WebAssembly-to-WebGPU Program Translation via Small Language Models
The proliferation of WebAssembly (WASM) for portable CPU-bound computation and language interoperability, together with WebGPU for high-performance graphics and compute, presents a new frontier for web applications. However, a significant semantic gap exists between their execution models: WASM’s stack-based virtual machine and the structured, GPU-centric paradigm of the WebGPU Shading Language (WGSL). This divergence is a substantial barrier to single-source development, because conventional compilers struggle with the complex, often non-isomorphic translation task, particularly under WGSL constraints such as the absence of 64-bit indexing. This paper proposes to bridge this divide by framing the translation from WebAssembly Text (WAT) to WGSL as a fine-tuning task for small language models (SLMs), and introduces a data pipeline that generates and validates a large-scale parallel dataset of 14,547 (WAT, WGSL) pairs from a GLSL corpus. Our preliminary experiments, which focus on sub-2B-parameter models with at least 32K native context, demonstrate the feasibility of this approach. The convergence of training loss, together with end-to-end compilation success and computation-result parity on an unseen evaluation split, particularly with the Qwen3 model family, indicates that compact models can learn the intricate syntax and structural mappings from a stack-based to a variable-based representation, including GPU-specific aspects such as vectorization. These models do not, however, generalize well to longer WAT sequences. These initial findings position neural translation as a promising direction for heterogeneous web targets.
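To make the stack-based versus variable-based gap mentioned in the abstract concrete, the following is a minimal hand-written sketch in Python. It is not the paper's model or data pipeline: the function name `wat_to_expression`, the four supported arithmetic instructions, and the sample instruction list are all assumptions chosen for illustration. It folds a flat sequence of WAT-style stack instructions into a single infix expression of the kind a WGSL `let` statement would hold.

```python
# Illustrative toy only: a rule-based fold, not the SLM-based translation
# described in the abstract. All names here are hypothetical.
BINARY_OPS = {"f32.add": "+", "f32.sub": "-", "f32.mul": "*", "f32.div": "/"}


def wat_to_expression(instructions):
    """Fold a flat, stack-based instruction list into one infix expression."""
    stack = []
    for instr in instructions:
        op, *args = instr.split()
        if op == "local.get":
            # Push the local's name (without the WAT "$" sigil).
            stack.append(args[0].lstrip("$"))
        elif op == "f32.const":
            # Push the literal as written.
            stack.append(args[0])
        elif op in BINARY_OPS:
            # The right operand is on top of the stack, the left one below it.
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(f"({lhs} {BINARY_OPS[op]} {rhs})")
        else:
            raise ValueError(f"unsupported instruction: {instr}")
    if len(stack) != 1:
        raise ValueError("instruction sequence does not leave exactly one value")
    return stack[0]


# Stack form of a * b + 1.0
wat = ["local.get $a", "local.get $b", "f32.mul", "f32.const 1.0", "f32.add"]
expr = wat_to_expression(wat)
print(f"let result = {expr};")  # prints: let result = ((a * b) + 1.0);
```

Even this tiny case shows why the mapping is structural rather than token-by-token: operand order and nesting exist only implicitly in the stack form and must be made explicit in the variable-based form. Control flow, memory access, and vector types, which the abstract identifies as the harder parts, are outside what this sketch handles.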
Oguz (Mehmet Oguz Derin, 𐱅𐰼𐰭𐰆𐰍𐰔) is a software data engineer serving as a technical specification editor for the WebGPU Shading Language at W3C. He has previously published research on volumetric data visualization, distributed processing using GPGPU interfaces, game engine integration, computer graphics, scientific visualization, and digitalization of historical scripts. He has also contributed to projects involving RISC-V development, a dataset for the Universal Dependencies project, and a keyboard for Keyman. In addition, he has experience developing mini-games and related tools.