Abstract
Fairness in machine learning (ML) is critically important for building trustworthy machine learning systems, as artificial intelligence (AI) systems increasingly impact many aspects of society, including healthcare decisions and legal judgments. Moreover, numerous studies have documented unfair outcomes in ML and the need for more robust fairness-aware methods. However, the data used to train models and develop debiasing techniques often contain biased and noisy labels. As a result, label bias in the training data degrades model performance and misrepresents the fairness of classifiers at test time. To tackle this problem, our paper presents Graph-based Fairness-aware Label Correction (GFLC), an efficient method for correcting label noise while preserving demographic parity in datasets. In particular, our approach combines three key components: a prediction confidence measure, graph-based regularization through Ricci-flow-optimized graph Laplacians, and explicit demographic parity incentives. Our experimental findings demonstrate the effectiveness of the proposed approach, with significant improvements in the trade-off between performance and fairness metrics compared to the baseline.
Abstract (translated)
Fairness in machine learning (ML) is critical for building trustworthy machine learning systems, as artificial intelligence (AI) systems exert a growing influence on many aspects of society, including healthcare decisions and legal judgments. Moreover, numerous studies show that unfair outcomes exist in ML and that stronger methods for identifying and mitigating unfairness are needed. However, the data used to train models and develop debiasing techniques often contain biased and noisy labels. Consequently, label bias in the training data affects model performance and misrepresents the fairness of classifiers during testing. To address this problem, our paper proposes Graph-based Fairness-aware Label Correction (GFLC), an effective method that corrects label noise while preserving demographic parity in the dataset. Specifically, our approach combines three key components: a prediction confidence measure, graph-based regularization via Ricci-flow-optimized graph Laplacians, and explicit demographic parity incentives. Experimental results demonstrate the effectiveness of the proposed method, with significant improvements in the trade-off between performance and fairness metrics over the baseline.
URL
https://arxiv.org/abs/2506.15620
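To make the abstract's three-component combination concrete, below is a minimal, hedged sketch of how such a label-correction score could be assembled. This is not the authors' implementation: the function names (gflc_style_scores, demographic_parity_gap), the additive combination, the weights lam_graph and lam_fair, and the use of a generic kNN-style Laplacian in place of the paper's Ricci-flow-optimized Laplacian are all illustrative assumptions.

```python
# Illustrative sketch only: combines (1) prediction confidence, (2) graph
# smoothness via a Laplacian, and (3) a demographic parity incentive into a
# per-sample score for identifying labels to correct, matching the abstract's
# high-level description. Names and formulas are assumptions, not GFLC itself.
import numpy as np

def demographic_parity_gap(y, s):
    """Absolute gap in positive-label rates between groups s=1 and s=0."""
    return abs(y[s == 1].mean() - y[s == 0].mean())

def gflc_style_scores(p_hat, y, s, L, lam_graph=1.0, lam_fair=1.0):
    """Higher score = label y_i looks more likely to be noisy.

    p_hat: (n,) predicted P(y=1); y: (n,) observed 0/1 labels;
    s: (n,) binary sensitive attribute; L: (n, n) graph Laplacian
    (a plain unnormalized Laplacian here, standing in for the paper's
    Ricci-flow-optimized Laplacian).
    """
    conf = np.abs(p_hat - y)             # model disagrees with the label
    graph = np.abs(L @ y.astype(float))  # label disagrees with graph neighbors
    base_gap = demographic_parity_gap(y, s)
    fair = np.empty_like(conf)
    for i in range(len(y)):              # parity-gap reduction if y_i flipped
        y_flip = y.copy()
        y_flip[i] = 1 - y_flip[i]
        fair[i] = base_gap - demographic_parity_gap(y_flip, s)
    return conf + lam_graph * graph + lam_fair * fair

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 8
    y = rng.integers(0, 2, n)
    s = np.array([0, 0, 0, 0, 1, 1, 1, 1])   # both groups nonempty
    p_hat = rng.random(n)
    A = (rng.random((n, n)) < 0.3).astype(float)
    A = np.triu(A, 1); A = A + A.T            # symmetric adjacency, no self-loops
    L = np.diag(A.sum(axis=1)) - A            # unnormalized graph Laplacian
    scores = gflc_style_scores(p_hat, y, s, L)
    flip = np.argsort(scores)[-2:]            # e.g., flip the two most suspect labels
    print("suspicion scores:", np.round(scores, 2))
    print("candidate flips at indices:", flip)
```

In the paper's actual pipeline, the Laplacian would be derived from a Ricci-flow-optimized graph and the three terms may be weighted or combined differently; this sketch shows only the general shape of scoring labels by confidence, graph smoothness, and fairness impact.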