Nyonic Technical Report

2024-04-24 07:38:44

Junfeng Tian, Rui Wang, Cong Li, Yudong Zhou, Jun Liu, Jun Wang

arXiv_CL

arXiv_CL Embedding Language_Model

Abstract

This report details the development and key achievements of our latest language model, designed as a foundation for custom large language models. The advancements introduced include a novel Online Data Scheduler that supports flexible training data adjustments and curriculum learning. The model's architecture incorporates state-of-the-art techniques such as Rotary Positional Embeddings, QK-LayerNorm, and a specially crafted multilingual tokenizer to enhance stability and performance. Moreover, our robust training framework incorporates advanced monitoring and rapid recovery features to ensure optimal efficiency. Our Wonton 7B model has demonstrated competitive performance on a range of multilingual and English benchmarks. Future developments will prioritize narrowing the performance gap with more extensively trained models, thereby enhancing the model's real-world efficacy and adaptability. GitHub: \url{this https URL}
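
The abstract names an Online Data Scheduler that allows training data adjustments and curriculum learning but does not describe its interface. The following is only a minimal illustrative sketch of the general idea, not the report's implementation: the class `OnlineDataScheduler`, the helper `curriculum_weights`, and the source names `easy` and `hard` are all hypothetical, and the linear schedule is an assumption chosen for simplicity.

```python
import random


class OnlineDataScheduler:
    """Hypothetical sketch: picks a data source per step from adjustable mixture weights."""

    def __init__(self, weights, seed=0):
        self._rng = random.Random(seed)
        self.set_weights(weights)

    def set_weights(self, weights):
        # Weights can be replaced at any point during training ("online" adjustment).
        if not weights or any(w < 0 for w in weights.values()):
            raise ValueError("weights must be non-empty and non-negative")
        total = sum(weights.values())
        if total <= 0:
            raise ValueError("at least one weight must be positive")
        # Normalize so the stored weights form a probability distribution.
        self.weights = {name: w / total for name, w in weights.items()}

    def next_source(self):
        # Sample the name of the data source to draw the next batch from.
        names = list(self.weights)
        probs = [self.weights[n] for n in names]
        return self._rng.choices(names, weights=probs, k=1)[0]


def curriculum_weights(step, total_steps):
    """Assumed linear curriculum: shift sampling mass from 'easy' to 'hard' data."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return {"easy": 1.0 - frac, "hard": frac}
```

In a training loop, one might call `sched.set_weights(curriculum_weights(step, total_steps))` before each `sched.next_source()`, so the data mixture changes without restarting the run.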

URL

https://arxiv.org/abs/2404.15702

PDF

https://arxiv.org/pdf/2404.15702.pdf
