Abstract
The worldwide adoption of machine learning (ML) and deep learning models, particularly in critical sectors such as healthcare and finance, presents substantial challenges in maintaining individual privacy and fairness, two elements vital to a trustworthy environment for learning systems. While numerous studies have concentrated on protecting individual privacy through differential privacy (DP) mechanisms, emerging research indicates that differential privacy in machine learning models can unequally impact separate demographic subgroups in terms of prediction accuracy. This leads to a fairness concern that manifests as biased performance. Although the prevailing view is that enhancing privacy intensifies fairness disparities, a smaller yet significant body of research suggests the opposite. In this article, with extensive evaluation results, we demonstrate that the impact of differential privacy on fairness is not monotonic. Instead, we observe that the accuracy disparity initially grows as more DP noise (enhanced privacy) is added to the ML process, but subsequently diminishes at higher privacy levels with even more noise. Moreover, implementing gradient clipping in the differentially private stochastic gradient descent (DP-SGD) method can mitigate the negative impact of DP noise on fairness: a lower clipping threshold moderates the growth of the disparity.
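To make the mechanism concrete, the DP-SGD update discussed above (per-sample gradient clipping followed by Gaussian noise) can be sketched as follows. This is an illustrative NumPy sketch, not the paper's implementation; the function name `dp_sgd_step` and its parameters are our own, and `clip_norm` corresponds to the clipping threshold whose lowering the paper finds to moderate disparity growth.

```python
import numpy as np

def dp_sgd_step(params, per_sample_grads, clip_norm, noise_multiplier, lr, rng):
    """One DP-SGD step (illustrative sketch, not the paper's code).

    1. Clip each per-sample gradient to L2 norm <= clip_norm.
    2. Average the clipped gradients over the batch.
    3. Add Gaussian noise whose scale is proportional to
       noise_multiplier * clip_norm (divided by batch size, since
       the noise is calibrated to the clipped per-sample sensitivity).
    """
    clipped = []
    for g in per_sample_grads:
        norm = np.linalg.norm(g)
        # Scale down only when the gradient exceeds the threshold.
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    avg = np.mean(clipped, axis=0)
    noise = rng.normal(
        0.0,
        noise_multiplier * clip_norm / len(per_sample_grads),
        size=avg.shape,
    )
    return params - lr * (avg + noise)
```

In this sketch, raising `noise_multiplier` corresponds to stronger privacy (more DP noise), while lowering `clip_norm` bounds each example's influence more tightly, the lever the abstract identifies for moderating the fairness disparity.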
URL
https://arxiv.org/abs/2404.09391