Abstract
In recent years, great success has been achieved in sentiment classification for English, thanks in part to the availability of copious annotated resources. Unfortunately, most languages do not enjoy such an abundance of labeled data. To tackle the sentiment classification problem in low-resource languages without adequate annotated data, we propose an Adversarial Deep Averaging Network (ADAN) to transfer the knowledge learned from labeled data in a resource-rich source language to low-resource languages where only unlabeled data exists. ADAN has two discriminative branches: a sentiment classifier and an adversarial language discriminator. Both branches take input from a shared feature extractor, which learns hidden representations that are simultaneously indicative of the classification task and invariant across languages. Experiments on Chinese and Arabic sentiment classification demonstrate that ADAN significantly outperforms state-of-the-art systems.
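The two-branch architecture described above can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the layer sizes, the single hidden layer, the two sentiment classes, the shared token-id vocabulary, the weight `lam`, and all function names are assumptions made for illustration. The sign-flipped language loss in `extractor_objective` is one common way to express an adversarial objective; the abstract does not specify the exact training procedure.

```python
import math
import random

def average_embeddings(token_ids, embeddings):
    """Deep-averaging step: mean of the word vectors of one document."""
    dim = len(embeddings[0])
    total = [0.0] * dim
    for t in token_ids:
        for i, v in enumerate(embeddings[t]):
            total[i] += v
    return [x / len(token_ids) for x in total]

def linear(x, weights, bias):
    """Affine layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(probs, label):
    return -math.log(probs[label])

def init_layer(rng, n_in, n_out):
    """Small random weights and zero biases (illustrative initialization)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def adan_forward(token_ids, params):
    """Shared feature extractor feeding two branches."""
    averaged = average_embeddings(token_ids, params["emb"])
    features = relu(linear(averaged, *params["extractor"]))
    sentiment_probs = softmax(linear(features, *params["sentiment"]))
    language_probs = softmax(linear(features, *params["language"]))
    return features, sentiment_probs, language_probs

def extractor_objective(sent_probs, sent_label, lang_probs, lang_label, lam=0.1):
    """Assumed objective for the shared extractor: fit sentiment labels
    (source language only) while making the language harder to predict.
    The discriminator itself would minimize cross_entropy(lang_probs, lang_label)."""
    task_loss = cross_entropy(sent_probs, sent_label) if sent_label is not None else 0.0
    return task_loss - lam * cross_entropy(lang_probs, lang_label)

# Hypothetical sizes: 6-word shared vocabulary, 3-dim embeddings, 4 hidden units.
rng = random.Random(0)
params = {
    "emb": [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(6)],
    "extractor": init_layer(rng, 3, 4),
    "sentiment": init_layer(rng, 4, 2),   # assumed 2 sentiment classes
    "language": init_layer(rng, 4, 2),    # source vs. target language
}
features, sent_p, lang_p = adan_forward([1, 4, 2], params)
# Labeled source document (language 0) and unlabeled target document (language 1).
source_obj = extractor_objective(sent_p, 1, lang_p, 0)
target_obj = extractor_objective(sent_p, None, lang_p, 1)
```

In this sketch, unlabeled target-language documents contribute only through the language term, which is how unlabeled data can still shape the shared features.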
URL
https://arxiv.org/abs/1606.01614