Abstract
For computers to recognize human emotions, expression classification is an important problem in human-computer interaction. In the 3rd Affective Behavior Analysis In-The-Wild competition, the expression classification task covers 8 classes, including the 6 basic facial expressions, predicted from videos. In this paper, we combine representations from RegNet, an attention module, and a Transformer encoder for the expression classification task. We achieve an F1-score of 35.87% on the validation set of the Aff-Wild2 dataset. This result shows the effectiveness of the proposed architecture.
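To make the reported metric concrete, the following is a minimal sketch of how an F1-score over 8 expression classes could be computed. It assumes the score is macro-averaged (the unweighted mean of per-class F1), which the abstract does not state explicitly. The function name `macro_f1` and the choice to score a class with no samples and no predictions as 0.0 are illustrative assumptions, not the authors' evaluation code.

```python
from collections import Counter


def macro_f1(y_true, y_pred, num_classes=8):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1   # correct prediction for class t
        else:
            fp[p] += 1   # class p was predicted but wrong
            fn[t] += 1   # class t was missed

    scores = []
    for c in range(num_classes):
        denom = 2 * tp[c] + fp[c] + fn[c]
        # Assumption: a class that never appears and is never predicted scores 0.0.
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / num_classes
```

For example, `macro_f1([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)` averages a per-class F1 of 2/3 for class 0 and 4/5 for class 1. Under macro averaging, rare expression classes weigh as much as frequent ones, which matters for imbalanced in-the-wild video data.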
URL
https://arxiv.org/abs/2203.12899