Detecting Building Changes with Off-Nadir Aerial Images

2023-01-26 04:04:14
Chao Pang, Jiang Wu, Jian Ding, Can Song, Gui-Song Xia

Abstract

The tilted viewing geometry of off-nadir aerial images poses two severe challenges for building change detection (BCD): the mismatch of nearby buildings and the semantic ambiguity of building facades. To tackle these challenges, we present a multi-task guided change detection network, named MTGCD-Net. The proposed model approaches the BCD problem through three auxiliary tasks: (1) a pixel-wise classification task to predict the roofs and facades of buildings; (2) an auxiliary task for learning the roof-to-footprint offset of each building to account for the misalignment between building roof instances; and (3) an auxiliary task for learning the identical roof matching flow between bi-temporal aerial images to tackle the building roof mismatch problem. These auxiliary tasks provide indispensable and complementary building parsing and matching information. The predictions of the auxiliary tasks are finally fused into the main building change detection branch with a multi-modal distillation module. To train and test models for the BCD problem with off-nadir aerial images, we create a new benchmark dataset, named BANDON. Extensive experiments demonstrate that our model outperforms previous state-of-the-art competitors.
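
To illustrate why a roof-to-footprint offset helps with roof misalignment, here is a minimal toy sketch and not the MTGCD-Net implementation. It assumes each building has a single integer pixel offset per image, which is a simplification introduced for this example. The names `roof_to_footprint` and `iou` and the 10x10 toy masks are hypothetical. The same unchanged building is imaged from two viewing angles, so its roofs land in different places, but shifting each roof by its own offset brings both onto the same footprint.

```python
from typing import Tuple

import numpy as np


def roof_to_footprint(roof_mask: np.ndarray, offset_xy: Tuple[int, int]) -> np.ndarray:
    """Translate a binary roof mask by an integer (dx, dy) pixel offset.

    Assumption for this sketch: one rigid offset per building.
    Pixels shifted outside the image are dropped.
    """
    dx, dy = offset_xy
    h, w = roof_mask.shape
    footprint = np.zeros_like(roof_mask)
    ys, xs = np.nonzero(roof_mask)
    ys_shifted, xs_shifted = ys + dy, xs + dx
    inside = (ys_shifted >= 0) & (ys_shifted < h) & (xs_shifted >= 0) & (xs_shifted < w)
    footprint[ys_shifted[inside], xs_shifted[inside]] = 1
    return footprint


def iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection over union of two binary masks."""
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0
    return float(np.logical_and(a, b).sum()) / float(union)


# Toy bi-temporal pair: the same building seen from two different angles.
roof_t1 = np.zeros((10, 10), dtype=np.uint8)
roof_t1[2:5, 2:5] = 1            # roof at time 1
roof_t2 = np.zeros((10, 10), dtype=np.uint8)
roof_t2[5:8, 1:4] = 1            # roof at time 2, displaced by the new view

# Hypothetical per-image offsets (dx, dy) from roof to footprint.
fp_t1 = roof_to_footprint(roof_t1, (1, 2))
fp_t2 = roof_to_footprint(roof_t2, (2, -1))

print("roof IoU:", iou(roof_t1, roof_t2))
print("footprint IoU:", iou(fp_t1, fp_t2))
```

In this toy case a naive comparison of the roof masks would flag a change even though the building is unchanged, while the offset-corrected footprints coincide.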

URL

https://arxiv.org/abs/2301.10922

PDF

https://arxiv.org/pdf/2301.10922.pdf

