Faceptor: A Generalist Model for Face Perception

Abstract

With comprehensive research already conducted on individual face analysis tasks, researchers are increasingly interested in developing a unified approach to face perception. Existing methods mainly address unified representation and training, and they lack task extensibility and application efficiency. To tackle this issue, we focus on a unified model structure and explore a face generalist model. As an intuitive design, Naive Faceptor lets tasks with the same output shape and granularity share the structural design of a standardized output head, which improves task extensibility. Furthermore, Faceptor adopts a well-designed single-encoder dual-decoder architecture in which task-specific queries represent newly introduced semantics. This design further unifies the model structure while improving application efficiency in terms of storage overhead. Additionally, we introduce Layer-Attention into Faceptor, enabling the model to adaptively select features from the optimal layers for each task. Through joint training on 13 face perception datasets, Faceptor achieves exceptional performance in facial landmark localization, face parsing, age estimation, expression recognition, binary attribute classification, and face recognition, matching or surpassing specialized methods in most tasks. Our training framework can also be applied to auxiliary supervised learning, significantly improving performance in data-sparse tasks such as age estimation and expression recognition. The code and models will be made publicly available.
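
The abstract does not say how Layer-Attention is computed. As a rough illustration of what "adaptively selecting features from optimal layers" could look like, the sketch below blends per-layer encoder features using softmax-normalized, task-specific scores. The function names (`softmax`, `layer_attention`) and the per-task `layer_logits` are hypothetical and chosen only for this example; this is one plausible reading, not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def layer_attention(layer_features, layer_logits):
    """Fuse features from several encoder layers into one vector.

    layer_features: list of L feature vectors, each of length D
    layer_logits:   list of L scores, assumed here to be learned per task
                    (hypothetical; the abstract gives no such detail)
    Returns the fused vector of length D and the layer weights.
    """
    if len(layer_features) != len(layer_logits):
        raise ValueError("need exactly one logit per layer")
    weights = softmax(layer_logits)
    fused = [0.0] * len(layer_features[0])
    for weight, features in zip(weights, layer_features):
        for i, value in enumerate(features):
            fused[i] += weight * value
    return fused, weights
```

With equal scores the fusion reduces to a plain average over layers; as one score grows, the output approaches that single layer's features, which is the sense in which a task could "select" its preferred layer.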

URL

https://arxiv.org/abs/2403.09500

PDF

https://arxiv.org/pdf/2403.09500.pdf
