Transformer-based Action recognition in hand-object interacting scenarios

Abstract

This report describes the 2nd place solution to the ECCV 2022 Human Body, Hands, and Activities (HBHA) from Egocentric and Multi-view Cameras Challenge: Action Recognition. The challenge aims to recognize hand-object interactions in an egocentric view. We propose a framework that estimates keypoints of two hands and an object with a Transformer-based keypoint estimator and recognizes actions based on the estimated keypoints. We achieved a top-1 accuracy of 87.19% on the test set.
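
The two-stage structure described above (keypoint estimation, then action recognition from keypoints) and the top-1 accuracy metric can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the names `recognize_action`, `estimate_keypoints`, `classify`, and `top1_accuracy` are hypothetical, the 3D `(x, y, z)` keypoint format is an assumption, and both the Transformer-based estimator and the action classifier are left as caller-supplied functions because the abstract does not specify them.

```python
from typing import Any, Callable, List, Sequence, Tuple

# Assumption: each keypoint is a 3D point (x, y, z).
Keypoint = Tuple[float, float, float]
# Hypothetical estimator output: (left hand, right hand, object) keypoints.
FrameKeypoints = Tuple[Sequence[Keypoint], Sequence[Keypoint], Sequence[Keypoint]]


def recognize_action(
    frames: Sequence[Any],
    estimate_keypoints: Callable[[Any], FrameKeypoints],
    classify: Callable[[List[List[float]]], int],
) -> int:
    """Stage 1: estimate keypoints per frame. Stage 2: classify the sequence."""
    sequence: List[List[float]] = []
    for frame in frames:
        left, right, obj = estimate_keypoints(frame)
        # Flatten the keypoints of both hands and the object into one vector.
        sequence.append([c for kp in (*left, *right, *obj) for c in kp])
    return classify(sequence)


def top1_accuracy(predicted: Sequence[int], labels: Sequence[int]) -> float:
    """Percentage of samples whose predicted class equals the true class."""
    if len(predicted) != len(labels) or not labels:
        raise ValueError("predicted and labels must be non-empty and equal in length")
    correct = sum(p == t for p, t in zip(predicted, labels))
    return 100.0 * correct / len(labels)
```

In such a design the classifier only sees keypoint coordinates, never raw pixels, which is what "recognizes actions based on the estimated keypoints" suggests.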

URL

https://arxiv.org/abs/2210.11387

PDF

https://arxiv.org/pdf/2210.11387.pdf