Paper Reading AI Learner

IDET: Iterative Difference-Enhanced Transformers for High-Quality Change Detection

2022-07-15 07:40:29
Rui Huang, Ruofei Wang, Qing Guo, Yuxiang Zhang, Wei Fan

Abstract

Change detection (CD) aims to detect change regions within an image pair captured at different times, playing a significant role for diverse real-world applications. Nevertheless, most of existing works focus on designing advanced network architectures to map the feature difference to the final change map while ignoring the influence of the quality of the feature difference. In this paper, we study the CD from a new perspective, i.e., how to optimize the feature difference to highlight changes and suppress unchanged regions, and propose a novel module denoted as iterative difference-enhanced transformers (IDET). IDET contains three transformers: two transformers for extracting the long-range information of the two images and one transformer for enhancing the feature difference. In contrast to the previous transformers, the third transformer takes the outputs of the first two transformers to guide the enhancement of the feature difference iteratively. To achieve more effective refinement, we further propose the multi-scale IDET-based change detection that uses multi-scale representations of the images for multiple feature difference refinements and proposes a coarse-to-fine fusion strategy to combine all refinements. Our final CD method outperforms seven state-of-the-art methods on six large-scale datasets under diverse application scenarios, which demonstrates the importance of feature difference enhancements and the effectiveness of IDET.

Abstract (translated)

URL

https://arxiv.org/abs/2207.09240

PDF

https://arxiv.org/pdf/2207.09240.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot