Paper Reading AI Learner

Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models

2024-09-27 17:44:58
Jiaming Li, Lei Zhang, Yunshui Li, Ziqiang Liu, Yuelin Bai, Run Luo, Longze Chen, Min Yang

Abstract

The instruction-following ability of large language models enables humans to interact with AI agents in a natural way. However, when required to generate responses of a specific length, large language models often struggle to meet users' needs due to their inherent difficulty in accurately perceiving numerical constraints. To explore the ability of large language models to control the length of generated responses, we propose the Target Length Generation Task (TLG) and design two metrics, Precise Match (PM) and Flexible Match (FM), to evaluate a model's performance in adhering to specified response lengths. Furthermore, we introduce a novel, model-agnostic approach called Ruler, which employs Meta Length Tokens (MLTs) to enhance the instruction-following ability of large language models under length-constrained instructions. Specifically, Ruler equips LLMs with the ability to generate responses of a specified length based on length constraints within the instructions. Moreover, Ruler can automatically generate an appropriate MLT when length constraints are not explicitly provided, demonstrating excellent versatility and generalization. Comprehensive experiments show the effectiveness of Ruler across different LLMs on the Target Length Generation Task, e.g., at All Level, an average gain of 27.97 on PM and 29.57 on FM. In addition, we conduct extensive ablation experiments to further substantiate the efficacy and generalization of Ruler. Our code and data are available at this https URL.
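
The abstract names the PM and FM metrics but does not define them, so the following is only a rough sketch of how a strict and a lenient length-match rate could be computed. Everything in it is an assumption made for illustration: the whitespace word count, the function names, and the absolute tolerances `pm_tol` and `fm_tol` are hypothetical and are not the ranges used in the paper.

```python
from typing import Dict, List


def word_count(text: str) -> int:
    # Assumption: length is measured in whitespace-separated words.
    return len(text.split())


def match_rates(
    responses: List[str],
    targets: List[int],
    pm_tol: int = 1,  # hypothetical strict tolerance, in words
    fm_tol: int = 3,  # hypothetical lenient tolerance, in words
) -> Dict[str, float]:
    """Return the fraction of responses whose length falls within
    pm_tol (strict) and fm_tol (lenient) words of the target length."""
    if len(responses) != len(targets):
        raise ValueError("responses and targets must have the same length")
    if not responses:
        raise ValueError("at least one response is required")

    strict_hits = 0
    lenient_hits = 0
    for response, target in zip(responses, targets):
        deviation = abs(word_count(response) - target)
        if deviation <= pm_tol:
            strict_hits += 1
        if deviation <= fm_tol:
            lenient_hits += 1

    total = len(responses)
    return {"PM": strict_hits / total, "FM": lenient_hits / total}


if __name__ == "__main__":
    demo_responses = [
        " ".join(["word"] * 10),  # exactly on target
        " ".join(["word"] * 3),   # far too short
        " ".join(["word"] * 12),  # slightly too long
    ]
    print(match_rates(demo_responses, [10, 10, 10]))
```

With a wider lenient tolerance than strict tolerance, the FM rate can never fall below the PM rate, which is the only property of the two metrics this sketch is meant to convey.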


URL

https://arxiv.org/abs/2409.18943

PDF

https://arxiv.org/pdf/2409.18943.pdf

