RUG: Turbo LLM for Rust Unit Test Generation (ICSE 2025 - Research Track)

Who

Xiang Cheng, Fan Sang, Yizhuo Zhai, Xiaokuan Zhang, Taesoo Kim

Track

ICSE 2025 Research Track

Time Zone

The program is currently displayed in (GMT-04:00) Eastern Time (US & Canada).

Use conference time zone: (GMT-04:00) Eastern Time (US & Canada)Select other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

By setting a time band, the program will dim events that are outside this time window. This is useful for (virtual) conferences with a continuous program (with repeated sessions).
The time band will also limit the events that are included in the personal iCalendar subscription service.

Display full programSpecify a time band

Save

When

Fri 2 May 2025 16:45 - 17:00 at 215 - SE for AI with Quality 3 Chair(s): Sumon Biswas

Abstract

Unit testing improves software quality by evaluating isolated sections of the program. This approach alleviates the need for comprehensive program-wide testing and confines the potential error scope within the software. However, unit test development is time-consuming, requiring developers to create appropriate test contexts and determine input values to cover different code regions. This problem is particularly pronounced in Rust due to its intricate type system, making traditional unit test generation tools ineffective in Rust projects. Recently, LLM have demonstrated their proficiency in understanding programming language and completing software engineering tasks. However, merely prompting LLM with a basic prompt like “generate unit test for the following source code” often results in code with compilation errors. In addition, LLM-generated unit tests often have limited test coverage.

To bridge this gap and harness the capabilities of LLM, we design and implement RUG, an end-to-end solution to automatically generate the unit test for Rust projects. To help LLM’s generated test pass Rust strict compilation checks, RUG designs a semantic-aware bottom-up approach to divide the context construction problem into dependent sub-problems. It solves these sub-problems sequentially using an LLM and merges them to a complete context. To increase test coverage, RUG integrates coverage-guided fuzzing with LLM to prepare fuzzing harnesses. Applying RUG on 17 real-world Rust programs (average 24,937 LoC), we show that RUG can achieve a high code coverage, up to 71.37%, closely comparable to human effort (73.18%). We submitted 113 unit tests generated by RUG covering the new code: 53 of them have been accepted, 17 were rejected, and 43 are pending for review.

Link to Preprint

https://taesoo.kim/pubs/2025/cheng:rug.pdf

File attachments

slide (icse-rug.pptx)	18.98MiB

Xiang Cheng

Georgia Institute of Technology

Fan Sang

Georgia Institute of Technology

Yizhuo Zhai

Georgia Institute of Technology

Xiaokuan Zhang

George Mason University

Taesoo Kim

Georgia Institute of Technology

United States

RUG Repo

Time Zone

The program is currently displayed in (GMT-04:00) Eastern Time (US & Canada).

Use conference time zone: (GMT-04:00) Eastern Time (US & Canada)Select other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

Display full programSpecify a time band

Save

Session Program

Fri 2 May
Displayed time zone: Eastern Time (US & Canada) change

16:00 - 17:30	SE for AI with Quality 3Research Track / SE In Practice (SEIP) at 215 Chair(s): Sumon Biswas Case Western Reserve University

16:00 15m Talk		Improved Detection and Diagnosis of Faults in Deep Neural Networks Using Hierarchical and Explainable ClassificationSE for AI Research Track Sigma Jahan Dalhousie University, Mehil Shah Dalhousie University, Parvez Mahbub Dalhousie University, Masud Rahman Dalhousie University Pre-print
16:15 15m Talk		Lightweight Concolic Testing via Path-Condition Synthesis for Deep Learning LibrariesSE for AI Research Track Sehoon Kim , Yonghyeon Kim UNIST, Dahyeon Park UNIST, Yuseok Jeon UNIST, Jooyong Yi UNIST, Mijung Kim UNIST
16:30 15m Talk		Mock Deep Testing: Toward Separate Development of Data and Models for Deep LearningSE for AI Research Track Ruchira Manke Tulane University, USA, Mohammad Wardat Oakland University, USA, Foutse Khomh Polytechnique Montréal, Hridesh Rajan Tulane University
16:45 15m Talk		RUG: Turbo LLM for Rust Unit Test GenerationSE for AI Research Track Xiang Cheng Georgia Institute of Technology, Fan Sang Georgia Institute of Technology, Yizhuo Zhai Georgia Institute of Technology, Xiaokuan Zhang George Mason University, Taesoo Kim Georgia Institute of Technology Pre-print Media Attached File Attached
17:00 15m Talk		Test Input Validation for Vision-based DL Systems: An Active Learning ApproachSE for AI SE In Practice (SEIP) Delaram Ghobari University of Ottawa, Mohammad Hossein Amini University of Ottawa, Dai Quoc Tran SmartInsideAI Company Ltd. and Sungkyunkwan University, Seunghee Park SmartInsideAI Company Ltd. and Sungkyunkwan University, Shiva Nejati University of Ottawa, Mehrdad Sabetzadeh University of Ottawa Pre-print
17:15 15m Talk		SEMANTIC CODE FINDER: An Efficient Semantic Search Framework for Large-Scale Codebases SE In Practice (SEIP) daeha ryu Innovation Center, Samsung Electronics, Seokjun Ko Samsung Electronics Co., Eunbi Jang Innovation Center, Samsung Electronics, jinyoung park Innovation Center, Samsung Electronics, myunggwan kim Innovation Center, Samsung Electronics, changseo park Innovation Center, Samsung Electronics