
PlanarTrack: A Large-scale Challenging Benchmark for Planar Object Tracking

2023-03-14 04:48:18
Xinran Liu, Xiaoqiong Liu, Ziruo Yi, Xin Zhou, Thanh Le, Libo Zhang, Yan Huang, Qing Yang, Heng Fan

Abstract

Planar object tracking is a critical computer vision problem and has drawn increasing interest owing to its key role in applications such as robotics and augmented reality. Despite rapid progress, its further development, especially in the deep learning era, is largely hindered by the lack of large-scale challenging benchmarks. To address this, we introduce PlanarTrack, a large-scale challenging planar tracking benchmark. Specifically, PlanarTrack consists of 1,000 videos with more than 490K images. All of these videos are collected in complex unconstrained scenarios in the wild, which makes PlanarTrack more challenging than existing benchmarks yet more realistic for real-world applications. To ensure high-quality annotation, each frame in PlanarTrack is manually labeled using four corners, with multiple rounds of careful inspection and refinement. To the best of our knowledge, PlanarTrack is, to date, the largest and most challenging dataset dedicated to planar object tracking. To analyze the proposed PlanarTrack, we evaluate 10 planar trackers and conduct comprehensive comparisons and in-depth analysis. Our results, not surprisingly, demonstrate that current top-performing planar trackers degrade significantly on the challenging PlanarTrack and that more effort is needed to improve planar tracking in the future. In addition, we derive from PlanarTrack a variant named PlanarTrack$_{\mathrm{BB}}$ for generic object tracking. Our evaluation of 10 excellent generic trackers on PlanarTrack$_{\mathrm{BB}}$ shows that, surprisingly, PlanarTrack$_{\mathrm{BB}}$ is even more challenging than several popular generic tracking benchmarks, and that more attention should be paid to handling such planar objects, even though they are rigid. All benchmarks and evaluations will be released at the project webpage.
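
The abstract states that each frame is labeled with four corners and that a bounding-box variant, PlanarTrack$_{\mathrm{BB}}$, is derived from those annotations, but it does not say how the boxes are computed. As a minimal sketch, assuming the box is simply the axis-aligned rectangle enclosing the four corners, the conversion might look like the following. The function name `corners_to_bbox`, the `(x, y, width, height)` output layout, and the sample coordinates are illustrative assumptions, not the authors' tooling or data.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]
BBox = Tuple[float, float, float, float]


def corners_to_bbox(corners: Sequence[Point]) -> BBox:
    """Return (x_min, y_min, width, height) of the axis-aligned box
    enclosing four (x, y) corner points.

    Assumption: the bounding-box variant uses the tightest axis-aligned
    rectangle around the four annotated corners; the abstract does not
    confirm this.
    """
    if len(corners) != 4:
        raise ValueError("expected exactly four (x, y) corner points")
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    x_min, y_min = min(xs), min(ys)
    return (x_min, y_min, max(xs) - x_min, max(ys) - y_min)


# Hypothetical four-corner annotation for one frame (not real dataset values).
corners = [(10.0, 20.0), (110.0, 30.0), (100.0, 90.0), (5.0, 80.0)]
print(corners_to_bbox(corners))  # (5.0, 20.0, 105.0, 70.0)
```

Under perspective distortion such a box can cover considerably more background than the planar target itself, which may be one reason generic trackers find these rigid objects hard to follow.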

URL

https://arxiv.org/abs/2303.07625

PDF

https://arxiv.org/pdf/2303.07625.pdf

