How Far Can It Go?: On Intrinsic Gender Bias Mitigation for Text Classification

2023-01-30 13:05:48
Ewoenam Tokpo, Pieter Delobelle, Bettina Berendt, Toon Calders

Abstract

To mitigate gender bias in contextualized language models, different intrinsic mitigation strategies have been proposed, alongside many bias metrics. Since these language models are ultimately used for downstream tasks like text classification, it is important to understand how, and to what extent, these intrinsic bias mitigation strategies actually translate to fairness in downstream tasks. In this work, we design a probe to investigate the effects that some of the major intrinsic gender bias mitigation strategies have on downstream text classification tasks. We discover that instead of resolving gender bias, intrinsic mitigation techniques and metrics are able to hide it in such a way that significant gender information is retained in the embeddings. Furthermore, we show that each mitigation technique is able to hide the bias from some of the intrinsic bias measures but not all, and each intrinsic bias measure can be fooled by some mitigation techniques, but not all. We confirm experimentally that none of the intrinsic mitigation techniques, used without any other fairness intervention, is able to consistently impact extrinsic bias. We recommend that intrinsic bias mitigation techniques be combined with other fairness interventions for downstream tasks.

URL

https://arxiv.org/abs/2301.12855

PDF

https://arxiv.org/pdf/2301.12855.pdf

