(Cancelled by the author) Measuring Prediction Sensitivity using Protected Gradients
In this paper we quantify the impact of protected attributes on the final model decision in the form of saliency scores. The geometry of word embeddings can be used to extract the subspaces of protected attributes such as gender, ethnicity, and sexual orientation. We use this subspace to compute a bias score for each word and phrase in the embedding space. We demonstrate empirically that, in situations where access to human annotators is restricted, this score can serve as a stand-in for the protected attribute. Furthermore, the directional derivative of the model along the bias direction can be used for fairness testing, yielding token-level sensitivity of the model to biased content embedded in the word embeddings.
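The abstract's pipeline — extract a protected-attribute direction from embedding geometry, score tokens against it, then take the model's directional derivative along it — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy embeddings, the word pairs, and the pair-averaging estimate of the bias direction are all assumptions (the paper may, for instance, use PCA over many definitional pairs), and `grad` stands in for a real input gradient from an autodiff framework.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy word embeddings; in practice these would come from a pretrained
# embedding model. Dimensions and words here are illustrative only.
dim = 8
words = ["he", "she", "man", "woman", "nurse", "engineer"]
emb = {w: rng.normal(size=dim) for w in words}

def bias_direction(pairs, emb):
    """Estimate a protected-attribute direction as the mean of normalized
    differences over definitional word pairs (one simple way to extract
    a gender subspace; an assumed method, not necessarily the paper's)."""
    diffs = []
    for a, b in pairs:
        d = emb[a] - emb[b]
        diffs.append(d / np.linalg.norm(d))
    v = np.mean(diffs, axis=0)
    return v / np.linalg.norm(v)

def bias_score(word, direction, emb):
    """Cosine of a word's embedding with the bias direction: a proxy
    for the protected attribute when human annotation is unavailable."""
    v = emb[word]
    return float(v @ direction / np.linalg.norm(v))

def directional_derivative(grad, direction):
    """Directional derivative of the model output along the bias
    direction: the input gradient dotted with that direction."""
    return float(grad @ direction)

g = bias_direction([("he", "she"), ("man", "woman")], emb)
print(bias_score("nurse", g, emb))

# `grad` would be the model's gradient w.r.t. a token's embedding
# (obtained via backpropagation); a random stand-in here.
grad = rng.normal(size=dim)
print(directional_derivative(grad, g))
```

A large magnitude of the directional derivative for a token would flag that the model's prediction is sensitive to movement along the protected-attribute direction at that token.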
Fri 23 Feb (displayed time zone: Chennai, Kolkata, Mumbai, New Delhi)

11:30 - 13:00 — Software Engineering in Practice session
- 11:30 (30m, talk): Product configuration generation from plan documents using document digitization tool
- 12:00 (30m, talk): (Cancelled by the author) Measuring Prediction Sensitivity using Protected Gradients — Sunil Gopa (Wells Fargo)
- 12:30 (30m, industry talk): (Cancelled by the author) Downstream bias mitigation is all you need — Sunil Gopa (Wells Fargo)