Scalable Change Retrieval Using Deep 3D Neural Codes

2019-04-07 00:26:32
Kojima Yusuke, Tanaka Kanji, Yang Naiming, Hirota Yuji

Abstract

We present a novel scalable framework for image change detection (ICD) from an on-board 3D imagery system. We argue that existing ICD systems are constrained by the time required to align a given query image with the coordinates of each individual reference image. We use an invariant coordinate system (ICS) to replace this time-consuming image alignment with an offline pre-processing procedure. Our key contribution is to extend the traditional image-comparison-based ICD task to the setting of the image retrieval (IR) task. We replace each component of the 3D ICD system, i.e., (1) image modeling, (2) image alignment, and (3) image differencing, with significantly more efficient variants from the bag-of-words (BoW) IR paradigm. Further, we train a deep 3D feature extractor in an unsupervised manner using a Siamese network and automatically collected training data. We conducted experiments on a challenging cross-season ICD task using a publicly available dataset, which validated the efficacy of the proposed approach.
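
The idea of casting change detection as bag-of-words retrieval can be illustrated with a toy sketch. It is not the authors' implementation: the abstract does not specify the data structures, so the `(cell_id, codeword_id)` word format, the function names, and the rarity-based score below are all assumptions made for illustration. The sketch assumes local features have already been quantized into visual words in a shared (invariant) coordinate system, so that a query word can be looked up directly in an inverted index without per-image alignment.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical visual word: (spatial cell id in the shared coordinate
# system, codeword id of the quantized local feature). Both are assumptions.
Word = Tuple[int, int]


def build_inverted_index(
    reference_images: Dict[str, Iterable[Word]]
) -> Dict[Word, Set[str]]:
    """Offline step: map each visual word to the reference images containing it."""
    index: Dict[Word, Set[str]] = defaultdict(set)
    for image_id, words in reference_images.items():
        for word in words:
            index[word].add(image_id)
    return dict(index)


def change_scores(
    query_words: Iterable[Word], index: Dict[Word, Set[str]]
) -> Dict[Word, float]:
    """Online step: score each query word by how rarely it occurs in the references.

    A word seen in no reference image gets score 1.0 (likely change);
    a word seen in n reference images gets 1 / (1 + n).
    """
    return {w: 1.0 / (1 + len(index.get(w, ()))) for w in query_words}


def rank_changes(scores: Dict[Word, float]) -> List[Word]:
    """Return query words ordered from most to least likely changed."""
    return sorted(scores, key=scores.get, reverse=True)
```

In this sketch the lookup cost per query word does not depend on the number of reference images, which is the kind of scalability the retrieval formulation aims for. A real system would use learned 3D descriptors and a more careful scoring function.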

URL

https://arxiv.org/abs/1904.03552

PDF

https://arxiv.org/pdf/1904.03552.pdf

