Is This You, LLM? Recognizing AI-written Programs with Multilingual Code Stylometry (SANER 2025 - Research Papers)

Who

Andrea Gurioli, Maurizio Gabbrielli, Stefano Zacchiroli

Track

SANER 2025 Research Papers

When

Thu 6 Mar 2025, 11:45 - 12:00 (GMT-05:00, Eastern Time, US & Canada) at M-1410 - Program Analysis. Chair(s): Rrezarta Krasniqi

Abstract

With the increasing popularity of LLM-based code completers, like GitHub Copilot, interest in automatically detecting AI-generated code is also increasing, particularly in contexts where the use of LLMs for programming is forbidden by policy due to security, intellectual property, or ethical concerns. We introduce a novel technique for AI code stylometry, i.e., the ability to distinguish code generated by LLMs from code written by humans, based on a transformer-based encoder classifier. Unlike previous work, our classifier is capable of detecting AI-written code across 10 different programming languages with a single machine learning model, maintaining high average accuracy across all languages (84.1% ± 3.8%). Together with the classifier, we also release H-AIRosettaMP, a novel open dataset for AI code stylometry tasks, consisting of 121 247 code snippets in 10 popular programming languages, labeled as either human-written or AI-generated. The experimental pipeline (dataset, training code, resulting models) is the first fully reproducible one for the AI code stylometry task. Most notably, our experiments rely only on open LLMs, rather than on proprietary/closed ones like ChatGPT.
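To make the reported figure concrete, here is a minimal sketch of how a single classifier's predictions could be summarized as an average accuracy with a spread across languages. It assumes the "±" denotes a standard deviation over per-language accuracies, which the abstract does not state; the function names and the toy records are hypothetical and are not taken from the paper or from H-AIRosettaMP.

```python
from collections import defaultdict
from statistics import mean, pstdev


def per_language_accuracy(records):
    """Compute accuracy per language.

    records: iterable of (language, true_label, predicted_label) tuples,
    where labels are e.g. "human" or "ai" (hypothetical label names).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for language, truth, predicted in records:
        totals[language] += 1
        hits[language] += int(truth == predicted)
    return {lang: hits[lang] / totals[lang] for lang in totals}


def summarize(accuracies):
    """Return (mean, population standard deviation) over languages.

    Assumption: the spread is a standard deviation across languages.
    """
    values = list(accuracies.values())
    return mean(values), pstdev(values)


# Toy predictions for illustration only; NOT data or results from the paper.
toy_records = [
    ("Python", "human", "human"),
    ("Python", "ai", "human"),
    ("Java", "ai", "ai"),
    ("Java", "human", "human"),
]

accuracies = per_language_accuracy(toy_records)
average, spread = summarize(accuracies)
print(f"{average:.1%} ± {spread:.1%}")
```

Averaging over languages rather than over all snippets gives each language equal weight regardless of how many snippets it contributes, which is one plausible reading of "average accuracy across all languages".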

Link to Preprint

https://arxiv.org/abs/2412.14611

Andrea Gurioli

DISI - University of Bologna

Italy

Maurizio Gabbrielli

DISI - University of Bologna

Italy

Stefano Zacchiroli

Télécom Paris, Polytechnic Institute of Paris

France