Training Data Debugging for the Fairness of Machine Learning Software
Fri 13 May 2022 04:05 - 04:10 at ICSE room 4-even hours - Software Fairness Chair(s): Aldeida Aleti
With the widespread application of machine learning (ML) software, especially in high-risk tasks, concerns about unfairness have been raised by both developers and users of ML software. Unfairness in ML software refers to software behavior that is affected by sensitive features (e.g., sex), which leads to biased and potentially illegal decisions and has become a problem worthy of attention from the whole software engineering community.
Following the "data-driven" programming paradigm of ML software, we regard biased features in the training data as the root cause of unfairness. Inspired by software debugging, we propose a novel method, Linear-regression based Training Data Debugging (LTDD), to debug feature values in the training data, i.e., to (a) identify which features, and which parts of them, are biased, and (b) exclude the biased parts of those features, recovering as much valuable and unbiased information as possible to build fair ML software. We conduct an extensive study on nine data sets and three classifiers to evaluate LTDD against four baseline methods. Experimental results show that (a) LTDD improves the fairness of ML software with less or comparable damage to performance, and (b) LTDD is more actionable for fairness improvement in realistic scenarios.
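The regression-based debugging idea can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the "biased part" of a feature is the component linearly predictable from the sensitive feature, and it omits LTDD's step of statistically identifying which features are biased. All names are illustrative.

```python
import numpy as np

def debias_by_regression(X, sensitive):
    """Illustrative sketch: for each feature, fit a linear regression on
    the sensitive feature and keep only the residual, i.e., the part of
    the feature not explained by the sensitive feature."""
    X = np.asarray(X, dtype=float)
    s = np.asarray(sensitive, dtype=float)
    X_debiased = np.empty_like(X)
    for j in range(X.shape[1]):
        # least-squares fit: x_j ~= slope * s + intercept
        slope, intercept = np.polyfit(s, X[:, j], deg=1)
        # exclude the biased part predicted from the sensitive feature
        X_debiased[:, j] = X[:, j] - (slope * s + intercept)
    return X_debiased

# Toy data: feature 0 is strongly correlated with the sensitive feature.
rng = np.random.default_rng(0)
s = rng.integers(0, 2, size=200)
X = np.column_stack([2.0 * s + rng.normal(size=200), rng.normal(size=200)])
Xd = debias_by_regression(X, s)
print(abs(np.corrcoef(Xd[:, 0], s)[0, 1]) < 0.1)  # → True: residual is nearly uncorrelated with s
```

After this transformation, a classifier would be trained on the residual features, so its decisions cannot lean on the linearly sensitive component of the data.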
04:00 - 05:00 | Software Fairness (Technical Track) at ICSE room 4-even hours. Chair(s): Aldeida Aleti (Monash University)

04:00 | 5m Talk | FairNeuron: Improving Deep Neural Network Fairness with Adversary Games on Selective Neurons. Technical Track. Xuanqi Gao (Xi'an Jiaotong University), Juan Zhai (Rutgers University), Shiqing Ma (Rutgers University), Chao Shen (Xi'an Jiaotong University), Yufei Chen (Xi'an Jiaotong University), Qian Wang (Wuhan University). DOI, Pre-print, Media Attached

04:05 | 5m Talk | Training Data Debugging for the Fairness of Machine Learning Software. Technical Track. Yanhui Li (Department of Computer Science and Technology, Nanjing University), Linghan Meng (Nanjing University), Lin Chen (Department of Computer Science and Technology, Nanjing University), Li Yu (Nanjing University), Di Wu (Momenta), Yuming Zhou (Nanjing University), Baowen Xu (Nanjing University). Pre-print, Media Attached

04:10 | 5m Talk | NeuronFair: Interpretable White-Box Fairness Testing through Biased Neuron Identification. Technical Track. Haibin Zheng (Zhejiang University of Technology), Zhiqing Chen (Zhejiang University of Technology), Tianyu Du (Zhejiang University), Xuhong Zhang (Zhejiang University), Yao Cheng (Huawei International), Shouling Ji (Zhejiang University), Jingyi Wang (Zhejiang University), Yue Yu (College of Computer, National University of Defense Technology, Changsha 410073, China), Jinyin Chen (College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China). DOI, Pre-print, Media Attached

04:15 | 5m Talk | Explanation-Guided Fairness Testing through Genetic Algorithm. Technical Track. Ming Fan (Xi'an Jiaotong University), Wenying Wei (Xi'an Jiaotong University), Wuxia Jin (Xi'an Jiaotong University), Zijiang Yang (Western Michigan University), Ting Liu (Xi'an Jiaotong University). DOI, Pre-print