Large Margin Structured Convolution Operator for Thermal Infrared Object Tracking

2018-07-19 08:21:20
Peng Gao, Yipeng Ma, Ke Song, Chao Li, Fei Wang, Liyi Xiao

Abstract

Compared with visible object tracking, thermal infrared (TIR) object tracking can track an arbitrary target in total darkness because it is unaffected by illumination variations. However, several undesirable attributes limit the potential of TIR tracking, such as the absence of visual color patterns and low resolution. Recently, the structured output support vector machine (SOSVM) and the discriminative correlation filter (DCF) have each been successfully applied to visible object tracking. Motivated by these, we propose a large margin structured convolution operator (LMSCO) to achieve efficient TIR object tracking. To improve tracking performance, we employ spatial regularization and implicit interpolation to obtain continuous deep feature maps, including deep appearance features and deep motion features, of the TIR targets. Finally, a collaborative optimization strategy is exploited to significantly update the operators. Our approach not only inherits the strong discriminative capability of SOSVM but also achieves accurate and robust tracking with higher-dimensional features and denser samples. To the best of our knowledge, we are the first to combine the advantages of DCF and SOSVM for TIR object tracking. Comprehensive evaluations on two thermal infrared tracking benchmarks, i.e., VOT-TIR2015 and VOT-TIR2016, clearly demonstrate that our LMSCO tracker achieves impressive results and outperforms most state-of-the-art trackers in terms of accuracy and robustness at a sufficient frame rate.
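
As background for the DCF component mentioned in the abstract, the following is a minimal sketch of a plain single-channel correlation filter trained by ridge regression in the Fourier domain. It is not the authors' LMSCO: it has no SOSVM objective, no spatial regularization, no implicit interpolation, and no deep features. The function names (gaussian_label, train_filter, locate) and the parameters sigma and lam are hypothetical choices made for illustration only.

```python
import numpy as np


def gaussian_label(shape, sigma=2.0):
    """Desired response: a Gaussian peak at the origin with circular wrap-around."""
    h, w = shape
    ys = np.minimum(np.arange(h), h - np.arange(h))[:, None]
    xs = np.minimum(np.arange(w), w - np.arange(w))[None, :]
    return np.exp(-(ys ** 2 + xs ** 2) / (2.0 * sigma ** 2))


def train_filter(patch, label, lam=1e-2):
    """Closed-form ridge regression solution, computed per frequency (assumed lam)."""
    X = np.fft.fft2(patch)
    Y = np.fft.fft2(label)
    return (Y * np.conj(X)) / (X * np.conj(X) + lam)


def locate(filter_hat, patch):
    """Return the (dy, dx) circular shift at which the filter response peaks."""
    response = np.real(np.fft.ifft2(filter_hat * np.fft.fft2(patch)))
    dy, dx = np.unravel_index(np.argmax(response), response.shape)
    return int(dy), int(dx)
```

In this sketch, a filter trained on one patch and applied to a circularly shifted copy of that patch should report the shift as the peak location. A real tracker would additionally crop search windows, apply a cosine window, handle multi-channel features, and update the filter over time.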

URL

https://arxiv.org/abs/1804.07006

PDF

https://arxiv.org/pdf/1804.07006.pdf

