Abstract
The emergence of various adapters, including Low-Rank Adaptation (LoRA) adopted from natural language processing, has allowed diffusion models to personalize image generation at low cost. However, due to challenges such as limited datasets and a shortage of regularization and computational resources, adapter training often yields unsatisfactory results, corrupting the backbone model's prior knowledge. A well-known symptom is the loss of diversity in object generation, especially within the same class: the model produces almost identical objects with only minor variations, which limits its generation capabilities. To solve this issue, we present Contrastive Adapter Training (CAT), a simple yet effective strategy that enhances adapter training through the application of a CAT loss. Our approach preserves the base model's original knowledge when the adapter is active. Furthermore, we introduce the Knowledge Preservation Score (KPS) to evaluate CAT's ability to retain prior information. We compare CAT's improvements both qualitatively and quantitatively. Finally, we discuss CAT's potential for multi-concept adapters and further optimization.
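The abstract does not spell out the CAT loss, but its stated goal (personalizing with an adapter while preserving the base model's prior) can be sketched as a two-term objective: the standard denoising loss on the personalization data, plus a term pulling the adapted model's noise prediction back toward the frozen base model's prediction. The function name `cat_loss`, the `lam` weighting, and the use of an MSE preservation term are all illustrative assumptions, not the paper's actual formulation.

```python
import torch
import torch.nn.functional as F

def cat_loss(adapter_pred, base_pred, noise, lam=1.0):
    """Hypothetical sketch of a CAT-style objective (not the paper's exact loss).

    adapter_pred: noise prediction from the model with the adapter enabled
    base_pred:    noise prediction from the frozen base model (no adapter)
    noise:        the ground-truth noise used in the diffusion forward process
    lam:          assumed weight balancing personalization vs. preservation
    """
    # Standard epsilon-prediction denoising loss on the personalization data.
    denoise = F.mse_loss(adapter_pred, noise)
    # Preservation term: keep the adapted prediction close to the base
    # model's, so the adapter does not overwrite prior knowledge.
    preserve = F.mse_loss(adapter_pred, base_pred.detach())
    return denoise + lam * preserve
```

As a usage sketch, `cat_loss` would be computed per training step on latent-shaped tensors, with only the adapter parameters receiving gradients:

```python
torch.manual_seed(0)
noise = torch.randn(2, 4, 8, 8)
base_pred = torch.randn(2, 4, 8, 8)
adapter_pred = base_pred + 0.1 * torch.randn(2, 4, 8, 8)
loss = cat_loss(adapter_pred, base_pred, noise)
```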
URL
https://arxiv.org/abs/2404.07554