AdaFPP: Adapt-Focused Bi-Propagating Prototype Learning for Panoramic Activity Recognition

Abstract

Panoramic Activity Recognition (PAR) aims to identify multi-granularity behaviors performed by multiple persons in panoramic scenes, including individual activities, group activities, and global activities. Previous methods either 1) rely heavily on manually annotated detection boxes in both training and inference, which hinders practical deployment, or 2) directly apply generic detectors to persons of varying size and spatial occlusion in panoramic scenes, which limits PAR performance. To this end, we learn a detector that adapts to occluded persons of varying size and is optimized jointly with the recognition module in an all-in-one framework. We propose a novel Adapt-Focused bi-Propagating Prototype learning (AdaFPP) framework that jointly recognizes individual, group, and global activities in panoramic scenes by learning an adapt-focused detector and multi-granularity prototypes as pretext tasks in an end-to-end way. Specifically, to accommodate the varying sizes and spatial occlusion of multiple persons in crowded panoramic scenes, we introduce a panoramic adapt-focuser, which achieves size-adaptive detection of individuals by selecting object-dense sub-regions identified through the original detections and performing fine-grained detection on them. In addition, to mitigate information loss caused by inaccurate individual localization, we introduce a bi-propagation prototyper that promotes closed-loop interaction and informative consistency across granularities through bidirectional information propagation among the individual, group, and global levels. Extensive experiments demonstrate the strong performance of AdaFPP and its applicability to PAR.
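
The idea of picking object-dense sub-regions from a first round of detections can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how density is measured or how regions are chosen, so the fixed grid, the box-center counting rule, and the names `dense_subregions`, `grid`, and `top_k` are all assumptions made for illustration only.

```python
from collections import Counter


def dense_subregions(boxes, image_size, grid=(4, 4), top_k=2):
    """Return the top_k grid cells (as pixel rectangles) with the most box centers.

    Hypothetical helper: a fixed grid and center counting are assumptions,
    not the selection rule used by AdaFPP.
    boxes: iterable of (x1, y1, x2, y2) coarse detections in pixels.
    image_size: (width, height) of the panoramic frame.
    """
    width, height = image_size
    cols, rows = grid
    cell_w, cell_h = width / cols, height / rows

    # Count how many detection centers fall into each grid cell.
    counts = Counter()
    for x1, y1, x2, y2 in boxes:
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        col = min(int(cx // cell_w), cols - 1)  # clamp centers on the right edge
        row = min(int(cy // cell_h), rows - 1)  # clamp centers on the bottom edge
        counts[(row, col)] += 1

    # Convert the densest cells back to (x1, y1, x2, y2) rectangles.
    regions = []
    for (row, col), _ in counts.most_common(top_k):
        regions.append(
            (col * cell_w, row * cell_h, (col + 1) * cell_w, (row + 1) * cell_h)
        )
    return regions
```

Under these assumptions, each returned rectangle would be cropped, upscaled, and passed to the detector again, so that small or occluded persons in crowded areas are seen at a finer resolution before the results are merged with the original detections.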

URL

https://arxiv.org/abs/2405.02538

PDF

https://arxiv.org/pdf/2405.02538.pdf
