Abstract
The instruction-following ability of large language models (LLMs) enables humans to interact with AI agents in a natural way. However, when required to generate responses of a specific length, LLMs often struggle to meet users' needs because they have inherent difficulty accurately perceiving numerical constraints. To explore the ability of LLMs to control the length of generated responses, we propose the Target Length Generation Task (TLG) and design two metrics, Precise Match (PM) and Flexible Match (FM), to evaluate a model's performance in adhering to specified response lengths. Furthermore, we introduce a novel, model-agnostic approach called Ruler, which employs Meta Length Tokens (MLTs) to enhance the instruction-following ability of LLMs under length-constrained instructions. Specifically, Ruler equips LLMs with the ability to generate responses of a specified length based on the length constraints within the instructions. Moreover, Ruler can automatically generate an appropriate MLT when length constraints are not explicitly provided, demonstrating excellent versatility and generalization. Comprehensive experiments show the effectiveness of Ruler across different LLMs on the Target Length Generation Task, e.g., at the All level, an average gain of 27.97 on PM and 29.57 on FM. In addition, we conduct extensive ablation experiments to further substantiate the efficacy and generalization of Ruler. Our code and data are available at this https URL.
URL
https://arxiv.org/abs/2409.18943