Detecting Malicious Source Code in PyPI Packages with LLMs: Does RAG Come in Handy? (EASE 2025 - Short Papers, Emerging Results)

Who

Motunrayo Osatohanmen Ibiyo, Thinakone Louangdy, Phuong T. Nguyen, Claudio Di Sipio, Davide Di Ruscio

Track

EASE 2025 Short Papers, Emerging Results

When

Fri 20 Jun 2025 16:40 - 16:50 at Workshop Room - Security. Chair(s): Ayse Tosun

Abstract

Malicious software packages in open-source ecosystems, such as PyPI, pose growing security risks. Unlike traditional vulnerabilities, these packages are intentionally designed to deceive users, and detection is challenging because attack methods keep evolving and structured datasets are lacking. In this work, we empirically evaluate the effectiveness of Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), and few-shot learning for detecting malicious source code. We fine-tune LLMs on curated datasets and integrate YARA rules, GitHub Security Advisories, and malicious code snippets with the aim of enhancing classification accuracy. We observed a counterintuitive outcome: although RAG is expected to boost prediction performance, it fails to do so in our evaluation, obtaining only mediocre accuracy. In contrast, few-shot learning is more effective: it significantly improves the detection of malicious code, achieving 97% accuracy and 95% balanced accuracy and outperforming traditional RAG approaches. Future work should therefore expand structured knowledge bases, refine retrieval models, and explore hybrid AI-driven cybersecurity solutions.
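The abstract reports both accuracy and balanced accuracy. As a minimal sketch of why the two can differ on an imbalanced malicious/benign dataset, the Python snippet below computes both from scratch. The labels and predictions are hypothetical illustration data, not results from the paper, and the function names are assumptions made for this example.

```python
# Illustrative sketch only: the data below is hypothetical and is NOT taken
# from the paper. Label convention assumed here: 1 = malicious, 0 = benign.

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(y_true, y_pred):
    """Fraction of all samples classified correctly."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)

def balanced_accuracy(y_true, y_pred):
    """Mean of recall on the malicious class and recall on the benign class."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    recall_malicious = tp / (tp + fn)
    recall_benign = tn / (tn + fp)
    return (recall_malicious + recall_benign) / 2

# Hypothetical imbalanced set: 90 benign packages, 10 malicious ones.
y_true = [0] * 90 + [1] * 10
# A degenerate classifier that labels everything as benign.
y_pred = [0] * 100

print("accuracy:", accuracy(y_true, y_pred))
print("balanced accuracy:", balanced_accuracy(y_true, y_pred))
```

In this toy case the always-benign classifier scores high on plain accuracy while its balanced accuracy stays at chance level, which is why reporting both figures is informative when malicious samples are the minority class.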

Link to Preprint

https://arxiv.org/abs/2504.13769

Motunrayo Osatohanmen Ibiyo

University of L'Aquila

Italy

Thinakone Louangdy

University of L'Aquila

Italy

Phuong T. Nguyen

University of L’Aquila

Italy

Claudio Di Sipio

University of L'Aquila

Italy

Davide Di Ruscio

University of L'Aquila

Italy

Session Program

Fri 20 Jun
Displayed time zone: Athens

15:30 - 17:00	Security: Posters and Vision / AI Models / Data / Research Papers / Short Papers, Emerging Results at Workshop Room. Chair(s): Ayse Tosun, Istanbul Technical University

15:30 15m Talk		Leveraging GPT-4 for Vulnerability-Witnessing Unit Test Generation AI Models / Data Gabor Antal FrontEndART Software Ltd., University of Szeged, Dénes Bán University of Szeged, Martin Isztin University of Szeged, Rudolf Ferenc University of Szeged, Peter Hegedus University of Szeged
15:45 15m Talk		SecCityVR: Visualization and Collaborative Exploration of Software Vulnerabilities in Virtual Reality Research Papers Dennis Wüppelmann Paderborn University, Enes Yigitbas Paderborn University Pre-print
16:00 15m Talk		Targeted Fuzzing for Unsafe Rust Code: Leveraging Selective Instrumentation Research Papers David Paaßen University of Duisburg-Essen, Jens-Rene Giesen University of Duisburg-Essen, Lucas Davi University of Duisburg-Essen Pre-print
16:15 15m Talk		There are More Fish in the Sea: An Empirical Study on Automated Vulnerability Repair via Binary Templates Research Papers Bo Lin National University of Defense Technology, Shangwen Wang National University of Defense Technology, Shencong Zeng Phytium Technology Co., Ltd., Liqian Chen National University of Defense Technology, Xiaoguang Mao National University of Defense Technology Pre-print
16:30 10m Talk		Validation Framework for E-Contract and Smart Contract Posters and Vision Sangharatna Godboley NIT Warangal, P. Radha Krishna National Institute of Technology Warangal, Sunkara Sri Harika National Institute of Technology Warangal, India, Pooja Varnam National Institute of Technology Warangal, India Pre-print
16:40 10m Talk		Detecting Malicious Source Code in PyPI Packages with LLMs: Does RAG Come in Handy? Short Papers, Emerging Results Motunrayo Osatohanmen Ibiyo University of L'Aquila, Thinakone Louangdy University of L'Aquila, Phuong T. Nguyen University of L’Aquila, Claudio Di Sipio University of l'Aquila, Davide Di Ruscio University of L'Aquila Pre-print
16:50 10m Talk		ThreMoLIA: Threat Modeling of Large Language Model-Integrated Applications Posters and Vision Felix Viktor Jedrzejewski Blekinge Institute of Technology, Davide Fucci Blekinge Institute of Technology, Oleksandr Adamov Blekinge Institute of Technology Pre-print