Action Quality Assessment using Transformers

2022-07-20 17:00:13

Abhay Iyer, Mohammad Alali, Hemanth Bodala, Sunit Vaidya

arXiv_CV

arXiv_CV CNN QA Relation Transformer Pose Action

Abstract

Action quality assessment (AQA) is an active research problem in video-based applications, and it is challenging because of the score variance per frame. Existing methods address it with convolution-based approaches, which are limited in their ability to capture long-range dependencies. With the recent advancements in Transformers, we show that they are a suitable alternative to conventional convolution-based architectures. Specifically, we ask whether Transformer-based models can solve the task of AQA by effectively capturing long-range dependencies, parallelizing computation, and providing a wider receptive field for diving videos. To demonstrate the effectiveness of our proposed architectures, we conducted comprehensive experiments and achieved a competitive Spearman correlation score of 0.9317. Additionally, we explore the effect of hyperparameters on the model's performance and pave a new path for exploiting Transformers in AQA.
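
The Spearman correlation quoted in the abstract measures how well the ranking of predicted scores agrees with the ranking of ground-truth scores. The abstract does not describe how the authors computed it, so the following is only a minimal sketch of the standard definition (Pearson correlation of average ranks). The function names `average_ranks` and `spearman` are illustrative and do not come from the paper.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(predicted, target):
    """Spearman rank correlation: Pearson correlation of the two rank lists."""
    if len(predicted) != len(target) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rp, rt = average_ranks(predicted), average_ranks(target)
    n = len(rp)
    mean_p, mean_t = sum(rp) / n, sum(rt) / n
    cov = sum((a - mean_p) * (b - mean_t) for a, b in zip(rp, rt))
    var_p = sum((a - mean_p) ** 2 for a in rp)
    var_t = sum((b - mean_t) ** 2 for b in rt)
    if var_p == 0 or var_t == 0:
        raise ValueError("correlation is undefined when all values are tied")
    return cov / sqrt(var_p * var_t)
```

As a usage example with made-up numbers, `spearman([80.1, 65.0, 90.0, 70.2], [86.4, 72.0, 95.2, 60.5])` compares four hypothetical predicted scores against four hypothetical judge scores. A value near 1 means the model orders the videos almost exactly as the judges did, even if the absolute scores differ.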

URL

https://arxiv.org/abs/2207.12318

PDF

https://arxiv.org/pdf/2207.12318.pdf