Paper Reading AI Learner

Fast derivation of neural network based document vectors with distance constraint and negative sampling

2018-07-29 13:49:00
Wei Li, Brian Mak

Abstract

A universal cross-lingual representation of documents is very important for many natural language processing tasks. In this paper, we present a document vectorization method that effectively creates document vectors via a self-attention mechanism within a neural machine translation (NMT) framework. The model used by our method can be trained with parallel corpora that are unrelated to the task at hand. At test time, our method takes a monolingual document and converts it into a "Neural machine Translation framework based crosslingual Document Vector with distance constraint training" (cNTDV). cNTDV is a follow-up to our previous work on the neural machine translation framework based document vector. cNTDV produces the document vector quickly from a single forward pass of the encoder. Moreover, it is trained with a distance constraint, so that the document vectors obtained from different language pairs are always consistent with each other. In a cross-lingual document classification task, our cNTDV embeddings surpass the published state-of-the-art performance on the English-to-German classification test and, to the best of our knowledge, achieve the second-best performance on the German-to-English classification test. Compared with our previous work, this method does not need a translator at test time, which makes it faster and more convenient.
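
The abstract describes forming the document vector by self-attention pooling over the NMT encoder's states, and training with a distance constraint plus negative sampling so that vectors of parallel documents agree across languages. Since the listing gives no implementation details, the following is only a minimal PyTorch sketch of that general idea; every module name, tensor shape, and the margin-based loss form are assumptions for illustration, not the authors' code.

```python
# Hypothetical sketch (not the authors' code): self-attention pooling over
# NMT encoder states to form a document vector, plus a distance-constraint
# loss with negative sampling between parallel documents.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfAttentionPooling(nn.Module):
    """Collapse a sequence of encoder states into a single document vector."""
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.scorer = nn.Linear(hidden_dim, 1)

    def forward(self, enc_states: torch.Tensor) -> torch.Tensor:
        # enc_states: (batch, seq_len, hidden_dim)
        weights = F.softmax(self.scorer(enc_states), dim=1)   # attention over positions
        return (weights * enc_states).sum(dim=1)              # (batch, hidden_dim)

def distance_constraint_loss(src_vec, tgt_vec, neg_vec, margin: float = 1.0):
    """Pull the vectors of a parallel document pair together and push a
    negatively sampled (non-parallel) document away by at least `margin`."""
    pos_dist = F.pairwise_distance(src_vec, tgt_vec)
    neg_dist = F.pairwise_distance(src_vec, neg_vec)
    return (pos_dist + F.relu(margin - neg_dist)).mean()

# Toy usage: in the real model the states would come from the NMT encoder's
# forward pass; random tensors stand in for them here.
batch, seq_len, hidden = 2, 7, 16
pool = SelfAttentionPooling(hidden)
src_doc = pool(torch.randn(batch, seq_len, hidden))   # source-language document
tgt_doc = pool(torch.randn(batch, seq_len, hidden))   # its translation
neg_doc = pool(torch.randn(batch, seq_len, hidden))   # unrelated document (negative sample)
loss = distance_constraint_loss(src_doc, tgt_doc, neg_doc)
loss.backward()
```

Because the document vector comes from a single encoder forward pass, no decoder or translation step is needed at test time, which is the speed advantage the abstract highlights.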

URL

https://arxiv.org/abs/1807.11057

PDF

https://arxiv.org/pdf/1807.11057.pdf

