1-Diffractor: Efficient and Utility-Preserving Text Obfuscation Leveraging Word-Level Metric Differential Privacy

2024-05-02 19:07:32
Stephen Meisenbacher, Maulik Chevli, Florian Matthes

Abstract

The study of privacy-preserving Natural Language Processing (NLP) has gained increasing attention in recent years. One promising avenue is the integration of Differential Privacy into NLP, which has produced innovative methods in a variety of application settings. Of particular note are $\textit{word-level Metric Local Differential Privacy (MLDP)}$ mechanisms, which obfuscate potentially sensitive input text by performing word-by-word $\textit{perturbations}$. Although these methods have shown promising results in empirical tests, they have two major drawbacks: (1) the inevitable loss of utility due to the addition of noise, and (2) the computational expense of running these mechanisms on high-dimensional word embeddings. In this work, we aim to address these challenges by proposing $\texttt{1-Diffractor}$, a new mechanism that offers substantial speedups over previous mechanisms while still demonstrating strong utility- and privacy-preserving capabilities. We evaluate $\texttt{1-Diffractor}$ for utility on several NLP tasks, for theoretical and task-based privacy, and for efficiency in terms of speed and memory. $\texttt{1-Diffractor}$ shows significant improvements in efficiency while maintaining competitive utility and privacy scores across all comparative tests conducted against previous MLDP mechanisms. Our code is made available at: this https URL.
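
To make the idea of word-by-word perturbation on word embeddings concrete, the following is a minimal sketch of a generic embedding-based word-level MLDP mechanism of the kind the abstract refers to as prior work. It is not 1-Diffractor, and it is not taken from the authors' code. The vocabulary, the random embeddings, and the function names (`sample_metric_noise`, `perturb_word`, `obfuscate`) are all hypothetical and chosen only for illustration. The sketch assumes a Euclidean metric and noise with density proportional to exp(-epsilon * ||z||), sampled as a uniform direction scaled by a Gamma-distributed magnitude.

```python
import numpy as np

def sample_metric_noise(dim, epsilon, rng):
    """Sample z in R^dim with density proportional to exp(-epsilon * ||z||)."""
    # Uniform direction on the unit sphere.
    direction = rng.normal(size=dim)
    direction /= np.linalg.norm(direction)
    # Radius follows Gamma(shape=dim, scale=1/epsilon).
    magnitude = rng.gamma(shape=dim, scale=1.0 / epsilon)
    return magnitude * direction

def perturb_word(word, vocab, embeddings, epsilon, rng):
    """Add noise to the word's embedding and return the nearest vocabulary word."""
    idx = vocab.index(word)
    noisy = embeddings[idx] + sample_metric_noise(embeddings.shape[1], epsilon, rng)
    # The nearest-neighbor search over the whole vocabulary is the costly step
    # when the vocabulary is large and the embeddings are high-dimensional.
    distances = np.linalg.norm(embeddings - noisy, axis=1)
    return vocab[int(np.argmin(distances))]

def obfuscate(text, vocab, embeddings, epsilon, rng):
    """Perturb a whitespace-tokenized text word by word."""
    result = []
    for token in text.split():
        if token in vocab:
            result.append(perturb_word(token, vocab, embeddings, epsilon, rng))
        else:
            # Assumption for this toy example only: out-of-vocabulary tokens
            # pass through unchanged, which gives them no privacy protection.
            result.append(token)
    return " ".join(result)

# Hypothetical toy vocabulary with random 8-dimensional embeddings.
rng = np.random.default_rng(0)
vocab = ["doctor", "nurse", "hospital", "bank", "money", "loan"]
embeddings = rng.normal(size=(len(vocab), 8))

print(obfuscate("doctor bank loan", vocab, embeddings, epsilon=1.0, rng=rng))
```

Smaller values of `epsilon` produce larger noise and therefore more frequent word replacements, which illustrates the utility loss noted in drawback (1). The per-word nearest-neighbor search illustrates the computational cost noted in drawback (2).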

URL

https://arxiv.org/abs/2405.01678

PDF

https://arxiv.org/pdf/2405.01678.pdf

