Abstract
With the proliferation of large language models (LLMs), the comprehensive alignment of such models across multiple tasks has emerged as a critical area of research. Existing alignment methodologies primarily address a single task, such as multi-turn dialogue, coding, mathematical problem-solving, or tool usage. However, AI-driven products that leverage language models usually require a fusion of these abilities to function effectively in real-world scenarios. Moreover, the considerable computational resources required to align LLMs properly underscore the need for a more robust, efficient, and encompassing approach to multi-task alignment that ensures improved generative performance. In response to these challenges, we introduce a novel technique termed Mixture-of-Instructions (MoI), which employs a strategy of instruction concatenation combined with diverse system prompts to boost the alignment efficiency of language models. We have also compiled a diverse set of seven benchmark datasets to rigorously evaluate the alignment efficacy of the MoI-enhanced language model. Applying our methodology to the open-source Qwen-7B-chat model yields Qwen-SFT-MoI, an enhanced model that demonstrates significant advancements in generative capabilities across coding, mathematics, and tool-use tasks.
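The core MoI idea described above — concatenating instructions from different tasks, each paired with its own system prompt, into one training sequence — can be sketched as follows. This is a hypothetical illustration, not the paper's implementation: the `mixture_of_instructions` function and the exact packing scheme are assumptions, though the `<|im_start|>`/`<|im_end|>` markers follow the Qwen chat template referenced by the base model.

```python
# Hedged sketch of Mixture-of-Instructions (MoI): pack samples from
# different tasks (coding, math, tool use, ...) into one concatenated
# training text, each sample keeping its own system prompt.
# The function name and packing layout are illustrative assumptions.

def format_turn(role, content):
    """Render one chat turn in the Qwen-style chat template."""
    return f"<|im_start|>{role}\n{content}<|im_end|>\n"

def mixture_of_instructions(samples):
    """Concatenate (system_prompt, instruction, response) triples,
    drawn from different tasks, into a single training sequence."""
    parts = []
    for system_prompt, instruction, response in samples:
        parts.append(format_turn("system", system_prompt))
        parts.append(format_turn("user", instruction))
        parts.append(format_turn("assistant", response))
    return "".join(parts)

packed = mixture_of_instructions([
    ("You are a coding assistant.",
     "Write a function that adds two numbers.",
     "def add(a, b):\n    return a + b"),
    ("You are a math tutor.",
     "What is 12 * 8?",
     "12 * 8 = 96"),
])
```

Because each packed sample carries a task-specific system prompt, the model can, in principle, learn to route its behavior by prompt while still seeing multiple skills within one sequence.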
URL
https://arxiv.org/abs/2404.18410