Abstract
Existing person re-identification methods have achieved remarkable advances in appearance-based identity association across homogeneous cameras, such as ground-ground matching. However, aerial-ground person re-identification (AGPReID) across heterogeneous cameras, a more practical scenario, has received minimal attention. The most significant challenge in AGPReID is that dramatic view discrepancy disrupts discriminative identity representation. To alleviate this, the view-decoupled transformer (VDT) is proposed as a simple yet effective framework. VDT has two major components designed to decouple view-related and view-unrelated features: hierarchical subtractive separation, which separates the two features inside the VDT, and orthogonal loss, which constrains them to be independent. In addition, we contribute a large-scale AGPReID dataset called CARGO, consisting of five aerial and eight ground cameras, 5,000 identities, and 108,563 images. Experiments on two datasets show that VDT is a feasible and effective solution for AGPReID, surpassing the previous method in mAP/Rank-1 by up to 5.0%/2.7% on CARGO and 3.7%/5.2% on AG-ReID, while keeping computational complexity at the same order of magnitude. Our project is available at this https URL
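The two components can be illustrated with a minimal numerical sketch: a residual subtraction that peels a view-related component off a global feature, and an orthogonality penalty on the cosine similarity between the two resulting features. All function and variable names below are assumptions for illustration, not the paper's actual implementation (which operates on transformer tokens).

```python
import numpy as np

def subtractive_separation(global_feat, view_feat):
    # Hypothetical sketch of subtractive separation: treat the
    # view-unrelated (identity) feature as the residual left after
    # removing the view-related component from the global feature.
    return global_feat - view_feat

def orthogonal_loss(view_feat, id_feat, eps=1e-8):
    # Penalize the absolute cosine similarity between the view-related
    # and view-unrelated features, pushing them toward independence.
    cos = np.dot(view_feat, id_feat) / (
        np.linalg.norm(view_feat) * np.linalg.norm(id_feat) + eps
    )
    return float(np.abs(cos))

# Toy example: the view component lies along the first axis only.
global_feat = np.array([1.0, 2.0, 3.0, 4.0])
view_feat = np.array([1.0, 0.0, 0.0, 0.0])
id_feat = subtractive_separation(global_feat, view_feat)
print(orthogonal_loss(view_feat, id_feat))  # 0.0: features are orthogonal
```

In this toy case the subtraction already yields an identity feature orthogonal to the view feature, so the loss is zero; during training the loss would drive learned features toward that state.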
URL
https://arxiv.org/abs/2403.14513