Automated Trustworthiness Oracle Generation for Machine Learning Text Classifiers
Machine learning (ML) for text classification has been widely used in various domains, such as toxicity detection, chatbot consulting, and review analysis. These applications can significantly impact ethics, economics, and human behavior, raising serious concerns about trusting ML decisions. Several studies indicate that traditional uncertainty metrics, such as model confidence, and performance metrics, like accuracy, are insufficient to build human trust in ML models. These models often learn spurious correlations during training and predict based on them during inference. When deployed in the real world, where such correlations are absent, their performance can deteriorate significantly. To avoid this, a common practice is to test whether predictions are made reasonably, based on valid patterns in the data; this gives rise to a challenge known as the trustworthiness oracle problem. So far, due to the lack of automated trustworthiness oracles, this assessment requires manual validation of the decision process disclosed by explanation methods. However, this approach is time-consuming, error-prone, and not scalable.
To address this problem, we propose TOKI, the first automated trustworthiness oracle generation method for text classifiers. TOKI automatically checks whether the words contributing the most to a prediction are semantically related to the predicted class. Specifically, we leverage ML explanation methods to extract the decision-contributing words and measure their semantic relatedness to the class based on word embeddings. As a demonstration of its practical usefulness, we also introduce a novel adversarial attack method that targets trustworthiness vulnerabilities identified by TOKI. We compare TOKI with a naive baseline based solely on model confidence. To evaluate their alignment with human judgement, experiments are conducted on human-created ground truths of approximately 6,000 predictions. Additionally, we compare the effectiveness of the TOKI-guided adversarial attack method with A2T, a state-of-the-art adversarial attack method for text classification. Results show that (1) prediction uncertainty metrics, such as model confidence, cannot effectively distinguish between trustworthy and untrustworthy predictions, (2) TOKI achieves 142% higher accuracy than the naive baseline, and (3) the TOKI-guided adversarial attack method is more effective, with fewer perturbations, than A2T.
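For a concrete picture of the checking step described above, the short Python sketch below illustrates the core idea under stated assumptions: take the words an explanation method attributes the prediction to, measure their embedding-based relatedness to the predicted class label, and flag the prediction when that relatedness is low. The function names, the threshold value, and the toy vectors are illustrative assumptions, not the paper's implementation.

# Minimal sketch (not the authors' implementation) of the idea described above:
# score how semantically related the decision-contributing words reported by an
# explanation method (e.g. LIME) are to the predicted class label, using word
# embeddings, and flag low-relatedness predictions as potentially untrustworthy.
# The helper `embed`, the threshold, and the toy vectors are illustrative assumptions.
import numpy as np

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def relatedness_score(contributing_words, class_label, embed):
    # Average semantic relatedness between contributing words and the class label.
    label_vec = embed(class_label)
    sims = [cosine(embed(w), label_vec) for w in contributing_words]
    return float(np.mean(sims)) if sims else 0.0

RELATEDNESS_THRESHOLD = 0.3  # illustrative value; would need tuning on validation data

def oracle_verdict(contributing_words, class_label, embed):
    score = relatedness_score(contributing_words, class_label, embed)
    return "trustworthy" if score >= RELATEDNESS_THRESHOLD else "untrustworthy"

if __name__ == "__main__":
    # Toy embeddings standing in for real pre-trained vectors (e.g. GloVe or word2vec).
    toy_vectors = {
        "toxic":  np.array([1.0, 0.1, 0.0]),
        "insult": np.array([0.9, 0.2, 0.1]),
        "monday": np.array([0.0, 0.1, 1.0]),
    }
    embed = lambda w: toy_vectors[w]
    print(oracle_verdict(["insult"], "toxic", embed))  # "trustworthy": related word
    print(oracle_verdict(["monday"], "toxic", embed))  # "untrustworthy": unrelated word

In this toy example, a "toxic" prediction explained mostly by an unrelated word such as "monday" would be flagged, which is the kind of spurious-correlation symptom the oracle is meant to surface.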
Presentation (FSE_SEandAI2_1120_LamNguyenTung_Automated.pptx) | 9.79 MiB
Wed 25 Jun (displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna)
11:00 - 12:30 | SE and AI 2 (Ideas, Visions and Reflections / Research Papers) at Cosmos Hall
Chair(s): Massimiliano Di Penta (University of Sannio, Italy)
11:00 (20m) Talk | Beyond PEFT: Layer-Wise Optimization for More Effective and Efficient Large Code Model Tuning (Research Papers)
Chaozheng Wang (The Chinese University of Hong Kong), jiafeng (University of Electronic Science and Technology of China), Shuzheng Gao (Chinese University of Hong Kong), Cuiyun Gao (Harbin Institute of Technology, Shenzhen), Li Zongjie (Hong Kong University of Science and Technology), Ting Peng (Tencent Inc.), Hailiang Huang (Tencent Inc.), Yuetang Deng (Tencent), Michael Lyu (Chinese University of Hong Kong)
DOI

11:20 (20m) Talk | Automated Trustworthiness Oracle Generation for Machine Learning Text Classifiers (Research Papers)
Lam Nguyen Tung (Monash University, Australia), Steven Cho (The University of Auckland, New Zealand), Xiaoning Du (Monash University), Neelofar Neelofar (Royal Melbourne Institute of Technology (RMIT)), Valerio Terragni (University of Auckland), Stefano Ruberto (JRC European Commission), Aldeida Aleti (Monash University)
DOI | Media Attached | File Attached

11:40 (20m) Talk | A Causal Learning Framework for Enhancing Robustness of Source Code Models (Research Papers)
Junyao Ye (Huazhong University of Science and Technology), Zhen Li (Huazhong University of Science and Technology), Xi Tang (Huazhong University of Science and Technology), Deqing Zou (Huazhong University of Science and Technology), Shouhuai Xu (University of Colorado Colorado Springs), Qiang Weizhong (Huazhong University of Science and Technology), Hai Jin (Huazhong University of Science and Technology)
DOI

12:00 (20m) Talk | Eliminating Backdoors in Neural Code Models for Secure Code Understanding (Research Papers)
Weisong Sun (Nanjing University), Yuchen Chen (Nanjing University), Chunrong Fang (Nanjing University), Yebo Feng (Nanyang Technological University), Yuan Xiao (Nanjing University), An Guo (Nanjing University), Quanjun Zhang (School of Computer Science and Engineering, Nanjing University of Science and Technology), Zhenyu Chen (Nanjing University), Baowen Xu (Nanjing University), Yang Liu (Nanyang Technological University)
DOI

12:20 (10m) Talk | Reduction Fusion for Optimized Distributed Data-Parallel Computations via Inverse Recomputation (Ideas, Visions and Reflections)
Haoxiang Lin (Microsoft Research), Yang Wang (Microsoft Research Asia), Yanjie Gao (Microsoft Research), Hongyu Zhang (Chongqing University), Ming Wu (Zero Gravity Labs), Mao Yang (Microsoft Research)
DOI | Pre-print
This is the main event hall of the Clarion Hotel, which will be used to host keynote talks and other plenary sessions. The FSE and ISSTA banquets will also take place in this room.
The room is just in front of the registration desk, on the other side of the main conference area. The large doors with numbers “1” and “2” provide access to the Cosmos Hall.