ICSME 2025
Sun 7 - Fri 12 September 2025 Auckland, New Zealand

AI-generated code has become an integral part of the mainstream developer workflow today. However, on community-driven platforms like Stack Overflow (SO), where trust, authorship, and credibility matter, AI-generated content raises concerns about AI plagiarism. While some recent studies have focused on detecting AI-generated code, they have mostly worked with large code samples from standard repositories and programming competitions. In contrast, code snippets on SO are often small and context-specific, making them much more difficult to detect. Moreover, prior studies have overlooked another aspect: recognizing adversarially prompted AI-generated code that is deliberately crafted to resemble human-written code. To address these limitations, we first introduce a large-scale dataset comprising 3,500 pairs of SO and ChatGPT answers, along with a curated set of 4,500 adversarially prompted AI responses. Next, we evaluate existing code language models on this newly curated dataset. Our evaluation shows that existing models perform well on standard AI answers but fail to detect adversarial ones. Finally, to improve detection, we propose an ensemble approach that combines stylometric features of code with code embeddings. Our approach shows consistent improvements across multiple models and generalizes better to adversarially prompted code. We release our full dataset to facilitate further research in this area.
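The abstract's ensemble idea (combining stylometric features with code embeddings) can be sketched as follows. This is an illustrative sketch only, not the paper's implementation: the specific features (average line length, comment ratio, indentation ratio, average token length) and the placeholder embedding are hypothetical choices for demonstration.

```python
import statistics

def stylometric_features(code: str) -> list[float]:
    """Extract simple hand-crafted style features from a code snippet.
    These four features are hypothetical examples, not the paper's set."""
    lines = [ln for ln in code.splitlines() if ln.strip()]
    if not lines:
        return [0.0, 0.0, 0.0, 0.0]
    avg_line_len = statistics.mean(len(ln) for ln in lines)
    comment_ratio = sum(ln.lstrip().startswith("#") for ln in lines) / len(lines)
    indent_ratio = sum(ln.startswith((" ", "\t")) for ln in lines) / len(lines)
    tokens = code.split()
    avg_token_len = statistics.mean(len(t) for t in tokens) if tokens else 0.0
    return [avg_line_len, comment_ratio, indent_ratio, avg_token_len]

def combined_vector(code: str, embedding: list[float]) -> list[float]:
    """Concatenate stylometric features with a code embedding; in practice
    the embedding would come from a pretrained code language model."""
    return stylometric_features(code) + embedding

snippet = "def add(a, b):\n    # sum two numbers\n    return a + b\n"
fake_embedding = [0.1, -0.2, 0.05]  # placeholder for a real model's output
vec = combined_vector(snippet, fake_embedding)
print(len(vec))  # 4 style features + 3 embedding dimensions = 7
```

The concatenated vector would then be fed to a downstream classifier trained to separate human-written from AI-generated snippets.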