GFLC: Graph-based Fairness-aware Label Correction for Fair Classification

2025-06-18 16:51:26
Modar Sulaiman, Kallol Roy

Abstract

Fairness in machine learning (ML) is critically important for building trustworthy ML systems, as artificial intelligence (AI) systems increasingly affect many aspects of society, including healthcare decisions and legal judgments. Moreover, numerous studies document unfair outcomes in ML and the need for more robust fairness-aware methods. However, the data used to train and develop debiasing techniques often contains biased and noisy labels. As a result, label bias in the training data affects model performance and misrepresents the fairness of classifiers during testing. To tackle this problem, our paper presents Graph-based Fairness-aware Label Correction (GFLC), an efficient method for correcting label noise while preserving demographic parity in datasets. In particular, our approach combines three key components: a prediction confidence measure, graph-based regularization through Ricci-flow-optimized graph Laplacians, and explicit demographic parity incentives. Our experimental findings demonstrate the effectiveness of the proposed approach, with significant improvements in the trade-off between performance and fairness metrics compared to the baseline.
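
Two of the quantities named in the abstract, demographic parity and graph-Laplacian regularization, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names are hypothetical, the sensitive attribute is assumed to be binary (0/1), and the Laplacian is the plain unnormalized one, L = D - W, not the Ricci-flow-optimized variant the paper describes.

```python
import numpy as np


def demographic_parity_difference(labels, groups):
    """Absolute gap in positive-label rate between group 0 and group 1.

    Assumption: binary labels and a binary sensitive attribute.
    A value of 0 means the labels satisfy demographic parity exactly.
    """
    labels = np.asarray(labels, dtype=float)
    groups = np.asarray(groups)
    rate_group0 = labels[groups == 0].mean()
    rate_group1 = labels[groups == 1].mean()
    return abs(rate_group0 - rate_group1)


def graph_laplacian(weights):
    """Unnormalized Laplacian L = D - W of a symmetric weight matrix W."""
    w = np.asarray(weights, dtype=float)
    return np.diag(w.sum(axis=1)) - w


def label_smoothness(labels, weights):
    """Quadratic form y^T L y.

    For symmetric W this equals the sum over edges of w_ij * (y_i - y_j)^2,
    so it is small when neighbouring samples carry the same label.
    """
    y = np.asarray(labels, dtype=float)
    return float(y @ graph_laplacian(weights) @ y)


# Toy data: six samples, three per group.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_groups = [0, 0, 0, 1, 1, 1]
dp_gap = demographic_parity_difference(toy_labels, toy_groups)

# Path graph on three nodes: 0 - 1 - 2, unit edge weights.
path_weights = [[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]]
penalty = label_smoothness([1, 1, 0], path_weights)
```

On the toy data, the positive-label rates are 2/3 and 1/3, so the parity gap is 1/3. On the path graph only the edge between nodes 1 and 2 joins differing labels, so the smoothness penalty is 1. A label-correction method in this spirit would prefer label flips that lower both quantities, though how GFLC weighs them is not stated in the abstract.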

URL

https://arxiv.org/abs/2506.15620

PDF

https://arxiv.org/pdf/2506.15620.pdf
