Abstract
Text detoxification is a textual style transfer (TST) task in which a text is paraphrased from a toxic surface form, e.g. one featuring rude words, into a neutral register. Recently, text detoxification methods have found applications in various tasks, such as detoxification of Large Language Models (LLMs) (Leong et al., 2023; He et al., 2024; Tang et al., 2023) and combating toxic speech on social networks (Deng et al., 2023; Mun et al., 2023; Agarwal et al., 2023). All these applications are essential for ensuring safe communication in the modern digital world. However, the previous approaches to collecting parallel text detoxification corpora -- ParaDetox (Logacheva et al., 2022) and APPDIA (Atwell et al., 2022) -- were explored only in a monolingual setup. In this work, we aim to extend the ParaDetox pipeline to multiple languages, presenting MultiParaDetox to automate parallel detoxification corpus collection for potentially any language. We then experiment with different text detoxification models -- from unsupervised baselines to LLMs and models fine-tuned on the presented parallel corpora -- showing the clear benefit of having a parallel corpus for obtaining state-of-the-art text detoxification models in any language.
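As a rough illustration of what "fine-tuned on a parallel corpus" can mean in practice, below is a minimal sketch of fine-tuning a multilingual seq2seq model on toxic-to-neutral sentence pairs with Hugging Face Transformers. The checkpoint (google/mt5-small), the toy two-sentence corpus, and the hyperparameters are assumptions for illustration, not the paper's exact setup.

```python
# Minimal sketch (assumptions, not the authors' exact setup): fine-tune a
# multilingual seq2seq model on toxic -> neutral parallel pairs, the kind of
# data a ParaDetox-style collection pipeline produces.
from datasets import Dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

checkpoint = "google/mt5-small"  # assumption: any multilingual seq2seq model could be used
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# Toy parallel corpus: each toxic sentence is paired with a neutral paraphrase.
pairs = Dataset.from_dict({
    "toxic": ["shut up, that plan is idiotic", "nobody cares about your stupid opinion"],
    "neutral": ["please stop, that plan will not work", "people may not agree with your opinion"],
})

def preprocess(batch):
    # Tokenize toxic inputs as the source and neutral paraphrases as the target.
    model_inputs = tokenizer(batch["toxic"], truncation=True, max_length=128)
    labels = tokenizer(text_target=batch["neutral"], truncation=True, max_length=128)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = pairs.map(preprocess, batched=True, remove_columns=pairs.column_names)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="detox-mt5", num_train_epochs=1),
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()

# Inference: paraphrase a toxic input into a neutral register.
inputs = tokenizer("shut up, that plan is idiotic", return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=64)[0], skip_special_tokens=True))
```

With a real parallel corpus in place, the same recipe applies per language or jointly across languages; the unsupervised baselines and LLM prompting mentioned in the abstract need no such training step.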
URL
https://arxiv.org/abs/2404.02037