Abstract
In this work, we explore how instance-level memorization in a teacher Neural Machine Translation (NMT) model is inherited by the student model in sequence-level knowledge distillation (SeqKD). We find that, despite never seeing the original training data directly, students memorize more than baseline models (models of the same size trained on the original data) -- 3.4% more for exact matches and 57% more for extractive memorization -- and show increased hallucination rates. Under this SeqKD setting, we further characterize how students behave on specific subgroups of the training data, such as low-quality subgroups and subgroups with particular counterfactual memorization (CM) scores, and find that students exhibit amplified denoising on low-quality subgroups. Finally, we propose Adaptive-SeqKD, a modification of SeqKD that intervenes during distillation to reduce memorization and hallucination. Overall, we recommend caution when applying SeqKD: students inherit both their teachers' superior performance and their failure modes, and therefore require active monitoring.
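A minimal sketch of the two ingredients the abstract refers to: constructing a SeqKD training set by relabeling source sentences with teacher outputs, and probing exact-match memorization on the original parallel data. The teacher_translate/model_translate callables and the corpus format are hypothetical placeholders for illustration, not the paper's implementation, and the paper's actual memorization metrics may be defined differently.

from typing import Callable, Iterable, Tuple

def build_seqkd_corpus(
    teacher_translate: Callable[[str], str],
    parallel_corpus: Iterable[Tuple[str, str]],
) -> list:
    # Sequence-level KD: replace each reference target with the teacher's
    # translation of the source, so the student never sees the original targets.
    return [(src, teacher_translate(src)) for src, _ in parallel_corpus]

def exact_match_rate(
    model_translate: Callable[[str], str],
    parallel_corpus: Iterable[Tuple[str, str]],
) -> float:
    # Fraction of original training pairs whose reference target the model
    # reproduces verbatim; one simple proxy for instance-level memorization.
    pairs = list(parallel_corpus)
    hits = sum(model_translate(src) == tgt for src, tgt in pairs)
    return hits / len(pairs)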
URL
https://arxiv.org/abs/2502.01491