Learning Sparse 2D Temporal Adjacent Networks for Temporal Action Localization

2019-12-08 04:16:28

Songyang Zhang, Houwen Peng, Le Yang, Jianlong Fu, Jiebo Luo

arXiv_CV

arXiv_CV Relation Sparse Pose Action Action_Localization

Abstract

In this report, we introduce the winning method of the HACS Temporal Action Localization Challenge 2019. Temporal action localization is challenging since a target proposal may be related to several other candidate proposals in an untrimmed video. Existing methods cannot tackle this challenge well because they consider temporal proposals individually and neglect their temporal dependencies. To address this issue, we propose sparse 2D temporal adjacent networks to model the temporal relationship between candidate proposals. This method is built upon the recently proposed 2D-TAN approach. The sampling strategy in 2D-TAN introduces the unbalanced context problem, where short proposals can perceive more context than long proposals. Therefore, we further propose a Sparse 2D Temporal Adjacent Network (S-2D-TAN). It is capable of involving more context information for long proposals and further learning discriminative features from them. By combining our S-2D-TAN with a simple action classifier, our method achieves an mAP of 23.49 on the test set, which wins first place in the HACS challenge.
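
The abstract does not spell out how candidate proposals are arranged, so the following is only a minimal sketch of one way a 2D temporal map could be laid out, assuming that cell (i, j) holds the proposal starting at clip i and ending after clip j. The function names `build_proposal_map` and `adjacent_proposals` are hypothetical, and the fixed 3x3 neighbourhood is an illustrative stand-in for whatever operator relates adjacent proposals; it is not the authors' implementation.

```python
def build_proposal_map(num_clips):
    """Return an N x N grid of candidate proposals.

    Assumed layout: cell (i, j) holds the (start, end) pair (i, j + 1),
    i.e. the proposal covering clips i..j. Cells with j < i would end
    before they start, so they are marked invalid with None.
    """
    return [
        [(i, j + 1) if j >= i else None for j in range(num_clips)]
        for i in range(num_clips)
    ]


def adjacent_proposals(grid, i, j):
    """List the valid proposals in the 3x3 neighbourhood of cell (i, j).

    The centre cell itself is excluded. Neighbours differ from the centre
    proposal by at most one clip at the start and/or end boundary.
    """
    n = len(grid)
    neighbours = []
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue  # skip the proposal itself
            a, b = i + di, j + dj
            if 0 <= a < n and 0 <= b < n and grid[a][b] is not None:
                neighbours.append(grid[a][b])
    return neighbours


if __name__ == "__main__":
    grid = build_proposal_map(4)
    print(grid[1][2])                     # proposal covering clips 1..2
    print(adjacent_proposals(grid, 1, 2)) # its valid neighbours on the map
```

Under this assumed layout, proposals that sit next to each other on the map share most of their temporal extent, which is what lets a model relate a target proposal to nearby candidates instead of scoring each one in isolation.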

URL

https://arxiv.org/abs/1912.03612

PDF

https://arxiv.org/pdf/1912.03612.pdf