Detecting Adversarial Prompted AI-Generated Code on Stack Overflow: A Benchmark Dataset and an Enhanced Detection Approach
This program is tentative and subject to change.
AI-generated code has become an integral part of mainstream developer workflows. However, on community-driven platforms like Stack Overflow (SO), where trust, authorship, and credibility are important, AI-generated content can raise concerns around AI plagiarism. While some recent studies have focused on detecting AI-generated code, they have mostly worked with large code samples from standard repositories and programming competitions. In contrast, code snippets on SO are often small and context-specific, making them much more difficult to detect. Moreover, prior studies have overlooked another aspect: recognizing adversarially prompted AI-generated code deliberately crafted to resemble human-written code. To address these limitations, we first introduce a large-scale dataset comprising 3,500 pairs of SO and ChatGPT answers, along with a curated set of 4,500 adversarially prompted AI responses. Next, we evaluate existing code language models on this newly curated dataset. Our evaluation shows that existing models perform well on standard AI answers but fail to detect adversarial ones. Finally, to improve detection, we propose an ensemble approach that combines stylometric features of code with code embeddings. Our approach shows consistent improvements across multiple models and generalizes better to adversarially prompted code. We release our full dataset to facilitate further research in this area.
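As a rough illustration of the ensemble idea described in the abstract, the sketch below fuses a handful of hand-crafted stylometric features with embeddings from a pre-trained code encoder and fits a simple classifier on the concatenation. The encoder choice (CodeBERT), the specific feature set, and the logistic-regression head are illustrative assumptions, not the paper's actual implementation.

    import re

    import numpy as np
    import torch
    from sklearn.linear_model import LogisticRegression
    from transformers import AutoModel, AutoTokenizer

    # Illustrative encoder choice; the paper does not specify this model.
    tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
    encoder = AutoModel.from_pretrained("microsoft/codebert-base").eval()

    def stylometric_features(code: str) -> np.ndarray:
        """A few hand-crafted style signals (hypothetical feature set)."""
        lines = code.splitlines() or [""]
        n = len(lines)
        return np.array([
            n,                                                  # snippet length in lines
            sum(len(l) for l in lines) / n,                     # mean line length
            sum(l.strip().startswith("#") for l in lines) / n,  # comment density
            len(re.findall(r"[A-Za-z_]\w*", code)) / max(len(code), 1),  # identifier density
        ], dtype=np.float32)

    def embed(code: str) -> np.ndarray:
        """Mean-pooled last-hidden-state embedding of a code snippet."""
        inputs = tokenizer(code, truncation=True, max_length=512, return_tensors="pt")
        with torch.no_grad():
            hidden = encoder(**inputs).last_hidden_state        # shape (1, seq_len, 768)
        return hidden.mean(dim=1).squeeze(0).numpy()

    def featurize(code: str) -> np.ndarray:
        """Concatenate stylometric features with the learned embedding."""
        return np.concatenate([stylometric_features(code), embed(code)])

    def train_detector(snippets, labels):
        """Fit a simple linear head; labels: 1 = AI-generated, 0 = human-written."""
        X = np.stack([featurize(s) for s in snippets])
        return LogisticRegression(max_iter=1000).fit(X, labels)

In practice the stylometric block would be standardized (e.g. with scikit-learn's StandardScaler) before concatenation, so that a handful of raw counts are not dwarfed by the 768-dimensional embedding.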
Thu 11 Sep (displayed time zone: Auckland, Wellington)
15:30 - 17:00 | Session 12 - Security 1 | NIER Track / Research Papers Track / Tool Demonstration Track / Journal First Track | Room TBD2
15:30 (15m) | Retrieve, Refine, or Both? Using Task-Specific Guidelines for Secure Python Code Generation | Research Papers Track | Catherine Tony (Hamburg University of Technology), Emanuele Iannone (Hamburg University of Technology), Riccardo Scandariato (Hamburg University of Technology) | Pre-print
15:45 (15m) | SAEL: Leveraging Large Language Models with Adaptive Mixture-of-Experts for Smart Contract Vulnerability Detection | Research Papers Track | Lei Yu (Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences, China), Shiqi Cheng (Institute of Software, Chinese Academy of Sciences, China), Zhirong Huang (Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences, China), Jingyuan Zhang (Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences, China), Chenjie Shen (Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences, China), Junyi Lu (Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences, China), Li Yang (Institute of Software, Chinese Academy of Sciences), Fengjun Zhang (Institute of Software, Chinese Academy of Sciences, China), Jiajia Ma (Institute of Software, Chinese Academy of Sciences, China)
16:00 (15m) | Evaluating the Maintainability of Forward-Porting Vulnerabilities in Fuzzer Benchmarks | Research Papers Track | Timothée Riom (Umeå University), Sabine Houy (Umeå University), Bruno Kreyssig (Umeå University), Alexandre Bartel (Umeå University)
16:15 (10m) | VulGuard: An Unified Tool for Evaluating Just-In-Time Vulnerability Prediction Models | Tool Demonstration Track | Duong Nguyen (Hanoi University of Science and Technology), Manh Tran-Duc (Hanoi University of Science and Technology), Le-Cong Thanh (The University of Melbourne), Triet Le (The University of Adelaide), Muhammad Ali Babar (School of Computer Science, The University of Adelaide), Quyet Thang Huynh (Hanoi University of Science and Technology)
16:25 (10m) | Explicit Vulnerability Generation with LLMs: An Investigation Beyond Adversarial Attacks | NIER Track | Emir Bosnak (Bilkent University), Sahand Moslemi Yengejeh (Bilkent University), Mayasah Lami (Bilkent University), Anil Koyuncu (Bilkent University) | Pre-print
16:35 (10m) | Detecting Adversarial Prompted AI-Generated Code on Stack Overflow: A Benchmark Dataset and an Enhanced Detection Approach | NIER Track | Aman Swaraj (Dept. of Computer Science & Engineering, Indian Institute of Technology, Roorkee, India), Krishna Agarwal (Dept. of Computer Science & Engineering, Indian Institute of Technology, Roorkee, India), Atharv Joshi (Indian Institute of Technology Roorkee), Sandeep Kumar (Dept. of Computer Science & Engineering, Indian Institute of Technology, Roorkee, India)
16:45 (15m) | Vulnerabilities in Infrastructure as Code: What, How Many, and Who? | Journal First Track | Aïcha War (University of Luxembourg), Alioune Diallo (University of Luxembourg), Andrew Habib (ABB Corporate Research, Germany), Jacques Klein (University of Luxembourg), Tegawendé F. Bissyandé (University of Luxembourg)