MERGE: Fast Private Text Generation

2023-05-25 06:27:19
Zi Liang, Pinghui Wang, Ruofei Zhang, Nuo Xu, Shuo Zhang

Abstract

Recent years have seen increasing concerns about the private inference of NLP services and Transformer models. However, existing two-party privacy-preserving methods consider only NLU scenarios, while the private inference of text generation tasks such as translation, dialogue, and code completion remains unsolved. Moreover, when migrated to NLG models, existing privacy-preserving methods perform poorly in terms of inference speed and suffer from a convergence problem during the training stage. To address these issues, we propose MERGE, a fast private text generation framework for Transformer-based language models. Specifically, MERGE reuses the output hidden state as the word embedding to bypass the embedding computation, and reorganizes the linear operations in the Transformer module to accelerate the forward procedure. Based on these two optimizations, extensive experiments show that MERGE achieves a 26.5x speedup at a sequence length of 512, reduces communication bytes by 80%, and is up to 10x faster than existing state-of-the-art models.
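
The first optimization in the abstract, reusing the output hidden state as the next word embedding, can be illustrated with a toy generation loop. This is only a minimal sketch under stated assumptions and not the authors' implementation: the single-layer `toy_decoder_step`, the tied `embedding` matrix, the sizes, and the choice to decode all token ids once after the loop are all hypothetical and chosen for illustration. It contrasts a conventional loop, which performs an argmax and an embedding lookup at every step, with a loop that feeds the hidden state straight back in.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, HIDDEN = 50, 16

# Hypothetical toy parameters (illustrative only, not from the paper).
embedding = rng.normal(size=(VOCAB, HIDDEN))                   # token id -> embedding
w_step = rng.normal(size=(HIDDEN, HIDDEN)) / np.sqrt(HIDDEN)   # stand-in for the Transformer stack


def toy_decoder_step(x):
    """Stand-in for one forward step: embedding-sized input -> output hidden state."""
    return np.tanh(x @ w_step)


def generate_standard(start_id, steps):
    """Conventional loop: hidden -> logits -> argmax -> embedding lookup at every step."""
    token_ids = [start_id]
    x = embedding[start_id]
    for _ in range(steps):
        hidden = toy_decoder_step(x)
        logits = hidden @ embedding.T       # tied output projection (assumption)
        next_id = int(np.argmax(logits))    # per-step argmax over the vocabulary
        token_ids.append(next_id)
        x = embedding[next_id]              # per-step embedding lookup
    return token_ids


def generate_reusing_hidden(start_id, steps):
    """Sketch of the abstract's idea: the output hidden state is fed back as the
    next input embedding, so the loop contains no argmax and no embedding lookup.
    Decoding token ids once after the loop is an assumption of this sketch."""
    x = embedding[start_id]
    hiddens = []
    for _ in range(steps):
        hidden = toy_decoder_step(x)
        hiddens.append(hidden)
        x = hidden                          # reuse hidden state as the word embedding
    logits = np.stack(hiddens) @ embedding.T
    return [start_id] + [int(i) for i in np.argmax(logits, axis=1)]


standard_ids = generate_standard(0, 5)
reused_ids = generate_reusing_hidden(0, 5)
```

Both loops produce the same first generated token, because their first step is identical; later tokens may differ, since the second loop no longer snaps its input to an actual row of the embedding table. With random toy weights that difference is meaningless; the abstract's training-stage discussion suggests that a real model would need to be trained for this mode of generation.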

URL

https://arxiv.org/abs/2305.15769

PDF

https://arxiv.org/pdf/2305.15769.pdf
