Abstract
Aligning Large Language Models (LLMs) with human values and preferences is essential for making them helpful and safe. However, building efficient tools to perform alignment can be challenging, especially for the largest and most competent LLMs, which often contain tens or hundreds of billions of parameters. We create NeMo-Aligner, a toolkit for model alignment that can efficiently scale to hundreds of GPUs for training. NeMo-Aligner comes with highly optimized and scalable implementations of major paradigms of model alignment, such as Reinforcement Learning from Human Feedback (RLHF), Direct Preference Optimization (DPO), SteerLM, and Self-Play Fine-Tuning (SPIN). Additionally, our toolkit supports running most of the alignment techniques in a Parameter-Efficient Fine-Tuning (PEFT) setting. NeMo-Aligner is designed for extensibility, allowing support for other alignment techniques with minimal effort. It is open-sourced under the Apache 2.0 License, and we invite community contributions at https://github.com/NVIDIA/NeMo-Aligner
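As context for one of the paradigms the abstract names, the sketch below shows the core DPO objective in plain PyTorch. It is a minimal illustration only, not NeMo-Aligner's API: NeMo-Aligner's actual implementation is distributed and Megatron-based, and the function name and arguments here are hypothetical. Each argument is assumed to be the summed per-token log-probability of a response under either the trained policy or the frozen reference model.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Direct Preference Optimization loss (hypothetical standalone sketch).

    Each tensor has shape (batch,) and holds per-sequence log-probabilities
    for the preferred (chosen) or dispreferred (rejected) response.
    `beta` scales the implicit KL penalty toward the reference model.
    """
    # Implicit reward: beta times the log-ratio of policy to reference model.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between chosen and rejected implicit rewards.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```

The design point this illustrates is why DPO scales more cheaply than RLHF: it needs only forward passes through a policy and a frozen reference model over preference pairs, with no separate reward model or on-policy rollout generation.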
URL
https://arxiv.org/abs/2405.01481