(Partial) Bad Snakes: Understanding and Improving Python Package Index Malware Scanning (accepted at ICSE 2023)
While attackers often distribute malware to victims via open-source, community-driven package repositories, these repositories do not currently run automated malware detection systems. In this work, we evaluate existing malware detection techniques for deployment in this setting by creating a benchmark dataset and comparing several existing tools, including the malware checks implemented in PyPI, Bandit4Mal, and OSSGadget’s OSS Detect Backdoor.
The measured tools have false positive rates between 15% and 97%; raising the thresholds of the detection rules to reduce this rate also lowers the true positive rate to the point of being useless. In some cases, these checks emitted alerts more often for benign packages than for malicious ones.
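To illustrate the threshold trade-off described above, the following is a minimal sketch and not the artifact's actual evaluation code. The function name `rates`, the rule "flag a package if it triggers at least `threshold` alerts", and the alert counts are all hypothetical and chosen only for illustration.

```python
def rates(alert_counts, labels, threshold):
    """Return (true_positive_rate, false_positive_rate).

    Assumption: a package is flagged when it triggers at least
    `threshold` alerts. labels[i] is True for a malicious package.
    """
    tp = fp = pos = neg = 0
    for count, is_malicious in zip(alert_counts, labels):
        flagged = count >= threshold
        if is_malicious:
            pos += 1
            tp += flagged
        else:
            neg += 1
            fp += flagged
    if pos == 0 or neg == 0:
        raise ValueError("need at least one malicious and one benign package")
    return tp / pos, fp / neg


# Hypothetical alert counts: the first four packages are malicious,
# the last four are benign. These are not results from the paper.
alert_counts = [5, 1, 3, 0, 4, 2, 0, 6]
labels = [True, True, True, True, False, False, False, False]

for threshold in (1, 3, 5):
    tpr, fpr = rates(alert_counts, labels, threshold)
    print(f"threshold={threshold}: TPR={tpr:.2f}, FPR={fpr:.2f}")
```

In this made-up data, benign packages trigger about as many alerts as malicious ones, so each increase in the threshold lowers the false positive rate and the true positive rate together.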
The purpose of this artifact is to provide other researchers with our code so that they can experiment with Python malware detection tools.
We aim to achieve the Artifacts Available and Reproduced badges. To use this artifact, we assume that reviewers have basic knowledge of Python programming. The artifacts can be run using a Jupyter notebook or a local Python interpreter.
Our artifact is available at https://zenodo.org/record/7578941#.Y9UNfXZByUk