Abstract
Unsupervised text style transfer aims to alter the underlying style of text while keeping its main content unchanged, without parallel data. Most existing methods follow two steps: first separating the content from the original style, and then fusing the content with the desired style. However, the separation in the first step is challenging because content and style interact in subtle ways in natural language. Therefore, in this paper, we propose a dual reinforcement learning framework that transfers the style of the text directly via a one-step mapping model, without any separation of content and style. Specifically, we consider the learning of the source-to-target and target-to-source mappings as a dual task, and two rewards are designed based on this dual structure to reflect style accuracy and content preservation, respectively. In this way, the two one-step mapping models can be trained via reinforcement learning, without any use of parallel data. Automatic evaluations show that our model outperforms the state-of-the-art systems by a large margin, with an average improvement of more than 8 BLEU points on two benchmark datasets. Human evaluations also validate the effectiveness of our model in terms of style accuracy, content preservation, and fluency. Our code and data, including the outputs of all baselines and our model, are available at this https URL.
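The dual-structure rewards described above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the style reward is assumed to come from a pre-trained style classifier, the content reward is approximated here by a simple token-overlap proxy for BLEU on the back-transferred sentence, and the harmonic-mean combination is an illustrative choice for requiring both rewards to be high.

```python
# Hedged sketch of the two dual-task rewards: all function names and
# the combination rule are assumptions for illustration only.

def style_reward(classifier_prob_target_style: float) -> float:
    # Confidence of a (hypothetical) pre-trained style classifier that
    # the transferred sentence carries the target style.
    return classifier_prob_target_style

def content_reward(original: str, reconstructed: str) -> float:
    # Token-overlap proxy for BLEU between the original sentence and
    # its reconstruction by the reverse (target-to-source) model.
    orig_tokens = original.split()
    rec_tokens = reconstructed.split()
    if not orig_tokens or not rec_tokens:
        return 0.0
    overlap = sum(min(orig_tokens.count(t), rec_tokens.count(t))
                  for t in set(rec_tokens))
    return overlap / len(rec_tokens)

def total_reward(r_style: float, r_content: float) -> float:
    # Harmonic mean: high only when BOTH style accuracy and content
    # preservation are high, so neither objective can be sacrificed.
    if r_style + r_content == 0.0:
        return 0.0
    return 2.0 * r_style * r_content / (r_style + r_content)
```

Because both mapping directions are trained against these rewards rather than against parallel references, no aligned sentence pairs are needed.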
URL
https://arxiv.org/abs/1905.10060