Abstract
Small large language models (sLLMs) offer the advantage of being lightweight and efficient, which makes them suitable for resource-constrained environments. However, sLLMs often struggle to maintain topic consistency in task-oriented dialogue systems, which is critical for scenarios such as service chatbots. Specifically, it is important to ensure that the model denies off-topic or malicious inputs and adheres to its intended functionality, so as to prevent potential misuse and uphold reliability. To this end, existing activation engineering approaches manipulate internal activations during inference. While these methods are effective in certain scenarios, our preliminary experiments reveal their limitations in ensuring topic adherence. To address this, we propose a novel approach termed Entropy-scaled Steering vectors for Topic Maintenance (EnSToM). EnSToM dynamically adjusts the steering intensity based on input uncertainty, which allows the model to handle off-topic distractors effectively while preserving on-topic accuracy. Our experiments demonstrate that EnSToM achieves significant performance gains with a relatively small data size compared to fine-tuning approaches. By improving topic adherence without compromising efficiency, our approach provides a robust solution for enhancing sLLM-based dialogue systems.
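The abstract describes the mechanism only at a high level, so the sketch below is an assumption-laden illustration of entropy-scaled activation steering rather than the authors' implementation: the construction of the steering vector, the entropy-to-coefficient mapping, the `alpha_max` cap, and the layer at which steering is applied are all hypothetical choices made for the example.

```python
# Minimal sketch of entropy-scaled activation steering.
# Not the paper's exact formulation; names and scaling choices are illustrative.
import torch
import torch.nn.functional as F

def entropy_coefficient(logits: torch.Tensor, alpha_max: float = 8.0) -> torch.Tensor:
    """Map next-token predictive entropy (normalized to [0, 1]) to a steering
    coefficient in [0, alpha_max]. The direction of the mapping (more uncertainty
    -> stronger steering) is an assumption for illustration, not the paper's rule."""
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * torch.log(probs + 1e-12)).sum(dim=-1)       # nats, shape (batch,)
    max_entropy = torch.log(torch.tensor(float(logits.shape[-1])))  # entropy of a uniform distribution
    return alpha_max * (entropy / max_entropy)

def apply_steering(hidden: torch.Tensor,
                   steering_vec: torch.Tensor,
                   logits: torch.Tensor) -> torch.Tensor:
    """Add an entropy-scaled steering vector to one layer's hidden states."""
    alpha = entropy_coefficient(logits)                   # (batch,)
    return hidden + alpha[:, None, None] * steering_vec   # broadcast over seq_len and d_model

# Toy usage with random tensors standing in for real model quantities.
batch, seq_len, d_model, vocab = 2, 16, 64, 32000
hidden = torch.randn(batch, seq_len, d_model)   # residual-stream activations at some layer
steering_vec = torch.randn(d_model)             # e.g., difference of mean on-topic vs. off-topic activations
logits = torch.randn(batch, vocab)              # next-token logits used to estimate input uncertainty
print(apply_steering(hidden, steering_vec, logits).shape)  # torch.Size([2, 16, 64])
```

In a real system the logits and hidden states would come from a forward pass of the sLLM, and the steering vector would be derived from collected activations; only the general idea of scaling steering intensity by input uncertainty is taken from the abstract.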
URL
https://arxiv.org/abs/2505.16526