Semantics-Preserved Distortion for Personal Privacy Protection

2022-01-04 04:01:05

Letian Peng, Zuchao Li, Hai Zhao

arXiv_CL

Abstract

Privacy protection is an important concern in Federated Learning, especially for Natural Language Processing. On client devices, users produce a large volume of text containing personal information every day. Because using this information directly is likely to invade personal privacy, many Federated Learning methods have been proposed to keep the central model from accessing the raw information on client devices. In this paper, we take a more linguistic approach: distorting the text while preserving its semantics. In practice, we leverage a recently proposed metric, Neighboring Distribution Divergence, to evaluate semantic preservation during distortion. Based on this metric, we propose two frameworks for semantics-preserved distortion, a generative one and a substitutive one. Because the current Natural Language Processing field lacks privacy-related tasks, we conduct experiments on named entity recognition and constituency parsing. Our experimental results show the plausibility and efficiency of our distortion as a method for personal privacy protection.
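
The abstract does not spell out how either framework works, so the following is only a rough sketch of what a substitutive distortion loop could look like, assuming a greedy strategy: each token is swapped for a candidate replacement only if a divergence score against the original text stays within a budget. Everything here is hypothetical and not taken from the paper: the names `substitutive_distort` and `toy_divergence`, the candidate table, the cost table, and the budget. In particular, `toy_divergence` is a hand-written stand-in, not Neighboring Distribution Divergence, which would be computed with a language model.

```python
from typing import Callable, Dict, List, Tuple

Divergence = Callable[[List[str], List[str]], float]


def toy_divergence(costs: Dict[Tuple[str, str], float]) -> Divergence:
    """Build a placeholder divergence from a hand-written cost table.

    ASSUMPTION: this is NOT Neighboring Distribution Divergence; it only
    sums a fixed cost for every position whose token was changed.
    """
    def score(original: List[str], distorted: List[str]) -> float:
        return sum(
            costs.get((o, d), 1.0)  # unknown swaps get a high default cost
            for o, d in zip(original, distorted)
            if o != d
        )
    return score


def substitutive_distort(
    tokens: List[str],
    candidates: Dict[str, List[str]],
    divergence: Divergence,
    budget: float,
) -> List[str]:
    """Greedily replace tokens while total divergence stays within budget."""
    out = list(tokens)
    for i, tok in enumerate(tokens):
        best, best_score = None, budget
        for cand in candidates.get(tok, []):
            trial = out[:i] + [cand] + out[i + 1:]
            s = divergence(tokens, trial)  # always compare to the original
            if s <= best_score:
                best, best_score = cand, s
        if best is not None:
            out[i] = best
    return out


if __name__ == "__main__":
    sentence = ["alice", "visited", "paris"]
    cands = {"alice": ["bob", "table"], "paris": ["london"]}
    costs = {("alice", "bob"): 0.1, ("alice", "table"): 0.9,
             ("paris", "london"): 0.2}
    print(substitutive_distort(sentence, cands, toy_divergence(costs), 0.5))
```

In this toy setup the budget acts as the knob between privacy and semantic preservation: a larger budget permits more substitutions, a smaller one leaves more of the original text intact.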

URL

https://arxiv.org/abs/2201.00965

PDF

https://arxiv.org/pdf/2201.00965.pdf