Abstract
Conditional Generative Adversarial Networks (CGANs) show significant promise for training supervised learning models because they can generate realistic labeled images. However, numerous studies have demonstrated privacy leakage risks in CGAN models. DPCGAN, a solution that incorporates the differential privacy framework, relies heavily on labeled data for training, and its aggressive gradient clipping can destroy original gradient information, making it difficult to maintain model accuracy. To address these challenges, we present a privacy-preserving training framework called PATE-TripleGAN. This framework introduces a classifier to pre-classify unlabeled data, establishing a three-party min-max game that reduces dependence on labeled data. Furthermore, we propose a hybrid gradient desensitization algorithm that combines the Private Aggregation of Teacher Ensembles (PATE) framework with Differentially Private Stochastic Gradient Descent (DPSGD). This algorithm retains gradient information more effectively while still guaranteeing privacy protection, thereby improving the model's utility. Privacy analysis and extensive experiments confirm that PATE-TripleGAN generates a higher-quality labeled image dataset while preserving the privacy of the training data.
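The two privacy mechanisms the abstract combines can be sketched in generic form. The following is a minimal illustration of (a) DPSGD-style per-example gradient clipping plus Gaussian noising and (b) PATE-style noisy aggregation of teacher votes; it is not the paper's hybrid algorithm, and all function names and parameters (`clip_norm`, `noise_multiplier`, `noise_scale`) are illustrative assumptions.

```python
import numpy as np

def dpsgd_sanitize(per_example_grads, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """Generic DPSGD sanitizer (illustrative, not the paper's exact method):
    clip each example's gradient to clip_norm, average, add Gaussian noise."""
    rng = np.random.default_rng() if rng is None else rng
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale down any gradient whose L2 norm exceeds the clipping bound.
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    mean = np.mean(clipped, axis=0)
    # Noise standard deviation is proportional to the clipping bound.
    noise = rng.normal(0.0, noise_multiplier * clip_norm / len(per_example_grads),
                       size=mean.shape)
    return mean + noise

def pate_noisy_vote(teacher_labels, num_classes, noise_scale=1.0, rng=None):
    """PATE-style noisy-max aggregation: count teacher votes per class,
    perturb the counts with Laplace noise, return the winning label."""
    rng = np.random.default_rng() if rng is None else rng
    counts = np.bincount(teacher_labels, minlength=num_classes).astype(float)
    counts += rng.laplace(0.0, noise_scale, size=num_classes)
    return int(np.argmax(counts))
```

In PATE-TripleGAN's setting, a sanitizer like the first function would desensitize gradients during training, while an aggregator like the second would let an ensemble of teachers label unlabeled data privately; the paper's contribution is in how the two are combined to avoid over-clipping.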
URL
https://arxiv.org/abs/2404.12730