Classifying Source Code: How Far Can Compressor-based Classifiers Go? (ICSE 2024 - SRC - ACM Student Research Competition) - ICSE 2024

Fri 12 - Sun 21 April 2024 Lisbon, Portugal

Track

ICSE 2024 SRC - ACM Student Research Competition

Time Zone

The program is currently displayed in (GMT+01:00) Lisbon.

Use conference time zone: (GMT+01:00) LisbonSelect other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

By setting a time band, the program will dim events that are outside this time window. This is useful for (virtual) conferences with a continuous program (with repeated sessions).
The time band will also limit the events that are included in the personal iCalendar subscription service.

Display full programSpecify a time band

When

Wed 17 Apr 2024 16:00 - 17:30 at Open Space - SRC Posters Chair(s): Mattia Fazzini, André Restivo
Thu 18 Apr 2024 11:12 - 11:24 at Vianna da Motta - SRC Presentations Chair(s): Mattia Fazzini, André Restivo

Abstract

Pre-trained language models of code, which are built upon largescale datasets, millions of trainable parameters, and high computational resources cost, have achieved phenomenal success. Recently, researchers have proposed a compressor-based classifier (Cbc); it trains no parameters but is found to outperform BERT. We conduct the first empirical study to explore whether this lightweight alternative can accurately classify source code. Our study is more than applying Cbc to code-related tasks. We first identify an issue that the original implementation overestimates Cbc. After correction, Cbc’s performance on defect prediction drops from 80.7% to 63.0%, which is still comparable to CodeBERT (63.7%). We find that hyperparameter settings affect the performance. Besides, results show that Cbc can outperform CodeBERT when the training data is small, making it a good alternative in low-resource settings.

Time Zone

The program is currently displayed in (GMT+01:00) Lisbon.

Use conference time zone: (GMT+01:00) LisbonSelect other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

By setting a time band, the program will dim events that are outside this time window. This is useful for (virtual) conferences with a continuous program (with repeated sessions).
The time band will also limit the events that are included in the personal iCalendar subscription service.

Display full programSpecify a time band

Session Program

Wed 17 Apr
Displayed time zone: Lisbon change

	16:00 - 17:30	SRC PostersSRC - ACM Student Research Competition at Open Space Chair(s): Mattia Fazzini University of Minnesota, André Restivo LIACC, Universidade do Porto, Porto, Portugal

	16:00 90m Poster		Program Decomposition and Translation with Static Analysis SRC - ACM Student Research Competition Ali Reza Ibrahimzada University of Illinois Urbana-Champaign DOI Pre-print File Attached
	16:00 90m Poster		IntTracer: Sanitization-aware IO2BO Vulnerability Detection across Codebases SRC - ACM Student Research Competition Xiang Chen Shanghai Jiao Tong University
	16:00 90m Poster		Vulnerability Root Cause Function Locating For Java Vulnerabilities SRC - ACM Student Research Competition Lyuye Zhang Nanyang Technological University
	16:00 90m Poster		Flakiness Repair in the Era of Large Language Models SRC - ACM Student Research Competition Yang Chen University of Illinois at Urbana-Champaign
	16:00 90m Poster		Refining Abstract Specifications into Dangerous Traffic Scenarios SRC - ACM Student Research Competition Aren Babikian McGill University
	16:00 90m Poster		An Ensemble Method for Bug Triaging using Large Language Models SRC - ACM Student Research Competition Atish Kumar Dipongkor University of Central Florida
	16:00 90m Poster		Classifying Source Code: How Far Can Compressor-based Classifiers Go? SRC - ACM Student Research Competition Zhou Yang Singapore Management University

Thu 18 Apr
Displayed time zone: Lisbon change

	11:00 - 12:30	SRC PresentationsSRC - ACM Student Research Competition at Vianna da Motta Chair(s): Mattia Fazzini University of Minnesota, André Restivo LIACC, Universidade do Porto, Porto, Portugal

	11:00 12m Poster		An Ensemble Method for Bug Triaging using Large Language Models SRC - ACM Student Research Competition Atish Kumar Dipongkor University of Central Florida
	11:12 12m Poster		Classifying Source Code: How Far Can Compressor-based Classifiers Go? SRC - ACM Student Research Competition Zhou Yang Singapore Management University
	11:24 12m Poster		Flakiness Repair in the Era of Large Language Models SRC - ACM Student Research Competition Yang Chen University of Illinois at Urbana-Champaign
	11:36 12m Poster		Program Decomposition and Translation with Static Analysis SRC - ACM Student Research Competition Ali Reza Ibrahimzada University of Illinois Urbana-Champaign DOI Pre-print File Attached
	11:48 12m Poster		Refining Abstract Specifications into Dangerous Traffic Scenarios SRC - ACM Student Research Competition Aren Babikian McGill University

Zhou Yang

Singapore Management University