Abstract
The right to be forgotten (RTBF) seeks to safeguard individuals from the enduring effects of their historical actions through machine unlearning techniques, which delete previously acquired knowledge without requiring extensive model retraining. However, these techniques often overlook a critical issue: bias in the unlearning process. This bias emerges from two main sources: (1) data-level bias, characterized by uneven data removal, and (2) algorithm-level bias, which contaminates the remaining dataset and thereby degrades model accuracy. In this work, we analyze the causal factors behind the unlearning process and mitigate bias at both the data and algorithm levels. Specifically, we introduce an intervention-based approach in which the knowledge to forget is erased with a debiased dataset. In addition, we guide the forgetting procedure with counterfactual examples, as they maintain semantic data consistency without hurting performance on the remaining dataset. Experimental results demonstrate that our method outperforms existing machine unlearning baselines on evaluation metrics.
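To make the idea of counterfactual-guided forgetting concrete, here is a minimal toy sketch: a logistic-regression model is fine-tuned on the retain set plus forget samples whose labels have been flipped to counterfactual values, so the model is pushed away from its memorized predictions without full retraining. The data, the model, and the label-flipping rule are all illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

# Illustrative sketch of counterfactual-guided unlearning on a toy
# logistic-regression classifier. All names and the relabeling rule
# are assumptions for illustration, not the paper's method.

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def train(X, y, w=None, lr=0.5, steps=1000):
    """Plain gradient descent on the mean logistic loss."""
    if w is None:
        w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = sigmoid(X @ w)
        w -= lr * X.T @ (p - y) / len(y)
    return w

def with_bias(X):
    return np.hstack([X, np.ones((len(X), 1))])

# Toy data: class 0 near (-2, 0); class 1 split into a retain blob
# near (+2, 0) and a small forget blob near (0, 3).
X_c0 = rng.normal((-2, 0), 0.3, size=(100, 2))
X_r1 = rng.normal((+2, 0), 0.3, size=(80, 2))
X_fg = rng.normal((0, 3), 0.3, size=(20, 2))
X = with_bias(np.vstack([X_c0, X_r1, X_fg]))
y = np.array([0] * 100 + [1] * 100)

forget_idx = np.arange(180, 200)
retain_idx = np.arange(0, 180)

# Original model, trained on everything including the forget blob.
w = train(X, y)

# Counterfactual relabeling: flip the forget samples' labels so that
# fine-tuning actively pushes the model away from what it memorized.
y_cf = y.copy()
y_cf[forget_idx] = 1 - y_cf[forget_idx]

# Unlearn by fine-tuning from the trained weights on retain data plus
# the counterfactually relabeled forget data -- no full retraining.
w_unlearned = train(X, y_cf, w=w.copy())

def accuracy(w, idx):
    return float(np.mean((sigmoid(X[idx] @ w) > 0.5) == y[idx]))

retain_acc = accuracy(w_unlearned, retain_idx)  # should stay high
forget_acc = accuracy(w_unlearned, forget_idx)  # should collapse
print(f"retain accuracy after unlearning: {retain_acc:.2f}")
print(f"accuracy on forgotten samples:    {forget_acc:.2f}")
```

Because the counterfactual labels keep the forget samples inside the training distribution instead of simply deleting them, the fine-tuning signal erases the memorized mapping while the retain-set decision boundary is largely undisturbed, which mirrors the trade-off the abstract describes.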
URL
https://arxiv.org/abs/2404.15760