WebAssembly is increasingly used as a portable compilation target for high-level programming languages. However, the current datasets of WebAssembly programs include only binaries, without their corresponding source code. Having the source code available alongside the binaries would help tool writers and would enable in-depth program analyses. We mined GitHub and collected 2540 C and C++ projects that are highly related to WebAssembly. From these projects, we extracted a dataset of 8915 binaries that belong to 572 projects and linked the WebAssembly binaries to their source code. To demonstrate an application of this dataset, we investigated the presence of eight WebAssembly compilation smells in a subset of these projects. We deployed Wasmizer, a tool that regularly mines GitHub projects and makes an up-to-date dataset of WebAssembly sources and their binaries publicly available.