Automating the Detection of Code Vulnerabilities by Analyzing GitHub Issues
Timely and accurate vulnerability detection has become increasingly important. This paper presents a novel approach that uses transformer-based models and machine learning techniques to automate the identification of software vulnerabilities by analyzing GitHub issues. We introduce a new dataset designed specifically for classifying GitHub issues relevant to vulnerability detection, and we then evaluate several classification techniques on it to determine their effectiveness. The results demonstrate the potential of this approach for real-world early vulnerability detection, which could substantially reduce the window of exploitation for software vulnerabilities. The key contribution is a scalable and computationally efficient framework for automated detection that can help users avoid compromised software before official notifications are issued. This work has the potential to enhance the security of open-source software ecosystems.
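To make the classification task concrete, here is a minimal sketch of labeling issue text as security-relevant or not. It is not the authors' method: the abstract does not specify the pipeline, so this swaps in a classical bag-of-words multinomial naive Bayes baseline in place of the transformer-based models. The class name `NaiveBayesIssueClassifier`, the labels `"security"` and `"other"`, and the example issue titles are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayesIssueClassifier:
    """Hypothetical baseline: multinomial naive Bayes with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token frequencies
        self.label_counts = Counter()            # label -> number of issues
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.label_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            denom = total_words + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up issue titles for illustration only; not taken from the paper's dataset.
TRAIN = [
    ("Buffer overflow in parser allows remote code execution", "security"),
    ("SQL injection possible through the search endpoint", "security"),
    ("Crafted request bypasses authentication check", "security"),
    ("Add dark mode to the settings page", "other"),
    ("Typo in the README installation section", "other"),
    ("Request: support exporting reports as CSV", "other"),
]

texts, labels = zip(*TRAIN)
clf = NaiveBayesIssueClassifier().fit(texts, labels)
print(clf.predict("Heap overflow leads to code execution in image parser"))
```

A baseline like this is cheap to train and gives a reference point against which heavier models can be compared; real issue text would also need handling of code snippets, stack traces, and issue bodies, which this sketch ignores.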
Sat 3 May (displayed time zone: Eastern Time, US & Canada)
11:00 - 12:30
11:00 | 60m | Keynote | Keynote 2: Towards Autonomous Language Model Systems (zoom talk) | LLM4Code | Ofir Press (Princeton University)
12:00 | 10m | Talk | With a Little Help from My (LLM) Friends: Enhancing Static Analysis with LLMs to Detect Software Vulnerabilities | LLM4Code | Amy Munson (University of California, San Diego), Juanita Gomez (University of California, Santa Cruz), Álvaro Cárdenas (University of California, Santa Cruz)
12:10 | 10m | Talk | Automating the Detection of Code Vulnerabilities by Analyzing GitHub Issues | LLM4Code | Daniele Cipollone (Delft University of Technology), Changjie Wang (KTH Royal Institute of Technology), Mariano Scazzariello (RISE Research Institutes of Sweden), Simone Ferlin (Red Hat), Maliheh Izadi (Delft University of Technology), Dejan Kostic (KTH Royal Institute of Technology), Marco Chiesa (KTH Royal Institute of Technology)
12:20 | 10m | Talk | COSMosFL: Ensemble of Small Language Models for Fault Localisation | LLM4Code | Pre-print