Abstract
Most existing stylistic text rewriting methods operate at the sentence level, but ignoring the broader context of the text can lead to generic, ambiguous, and incoherent rewrites. In this paper, we propose integrating preceding textual context into both the rewriting and evaluation stages of stylistic text rewriting, focusing on formality, toxicity, and sentiment transfer tasks. We conduct a comparative evaluation of rewriting through few-shot prompting of GPT-3.5 and GPT-NeoX, comparing non-contextual rewrites to contextual rewrites. Our experiments show that humans often prefer contextual rewrites over non-contextual ones, but automatic metrics (e.g., BLEU, sBERT) do not. To bridge this gap, we propose context-infused versions of common automatic metrics, and show that these better reflect human preferences. Overall, our paper highlights the importance of integrating preceding textual context into both the rewriting and evaluation stages of stylistic text rewriting.
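The abstract does not say how the context-infused metrics are built. As a minimal sketch of one plausible reading, the Python below prepends the preceding context to both the original sentence and the rewrite before scoring. The names `unigram_f1` and `contextual_score`, the example texts, and the token-overlap metric (a simple stand-in for BLEU or an sBERT similarity) are all illustrative assumptions, not the paper's implementation.

```python
from collections import Counter
from typing import Callable


def unigram_f1(reference: str, candidate: str) -> float:
    """Token-overlap F1: a toy stand-in for BLEU or an embedding similarity."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())  # multiset intersection of tokens
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def contextual_score(
    metric: Callable[[str, str], float],
    context: str,
    original: str,
    rewrite: str,
) -> float:
    """Assumed context-infused variant: score with the context prepended to both sides."""
    return metric(f"{context} {original}", f"{context} {rewrite}")


# Hypothetical formality-transfer example (not taken from the paper).
context = "Did you get the report I sent over?"
original = "yeah got it, looks fine"
rewrite = "Yes, I received it, and it looks fine."

plain = unigram_f1(original, rewrite)  # sentence-level score
infused = contextual_score(unigram_f1, context, original, rewrite)
print(f"sentence-level: {plain:.3f}  context-infused: {infused:.3f}")
```

Any pairwise metric with the same `(reference, candidate)` signature could be passed in place of `unigram_f1`. With a pure overlap metric the shared context mostly raises the score, so a learned similarity may be where this kind of wrapper matters most.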
URL
https://arxiv.org/abs/2305.14755