Abstract
Change detection in remote sensing images is essential for tracking environmental changes on the Earth's surface. Despite the success of vision transformers (ViTs) as backbones in numerous computer vision applications, they remain underutilized in change detection, where convolutional neural networks (CNNs) continue to dominate due to their powerful feature extraction capabilities. In this paper, we show that ViTs have a distinct advantage in discerning large-scale changes, an area where CNNs fall short. Capitalizing on this insight, we introduce ChangeViT, a framework that adopts a plain ViT backbone to improve the detection of large-scale changes. The framework is supplemented by a detail-capture module that generates detailed spatial features and a feature injector that efficiently integrates fine-grained spatial information into high-level semantic learning. This feature integration ensures that ChangeViT excels both at detecting large-scale changes and at capturing fine-grained details, providing comprehensive change detection across diverse scales. Without bells and whistles, ChangeViT achieves state-of-the-art performance on three popular high-resolution datasets (i.e., LEVIR-CD, WHU-CD, and CLCD) and one low-resolution dataset (i.e., OSCD), which underscores the untapped potential of plain ViTs for change detection. Furthermore, thorough quantitative and qualitative analyses validate the efficacy of the introduced modules. The source code is publicly available.
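The abstract does not specify how the feature injector works. As a rough illustration only, the sketch below assumes it can be approximated by scaled dot-product cross-attention, in which high-level ViT tokens act as queries over detail features and the result is added back residually. The names `inject_details` and `change_features` are hypothetical, the learned projections a real model would use are omitted, and the absolute-difference comparison of bi-temporal features is a common choice assumed here rather than taken from the paper.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def inject_details(tokens, details):
    """Hypothetical feature injector (an assumption, not the paper's code).

    Each high-level token attends over all detail features with scaled
    dot-product attention; the attended detail vector is added to the token.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    fused = []
    for q in tokens:
        # Similarity between this token (query) and every detail feature (key).
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in details]
        weights = softmax(scores)
        # Weighted sum of detail features (values), one entry per channel.
        attended = [
            sum(w * d[i] for w, d in zip(weights, details)) for i in range(dim)
        ]
        # Residual connection keeps the original semantic token.
        fused.append([qi + ai for qi, ai in zip(q, attended)])
    return fused


def change_features(feats_t1, feats_t2):
    """Assumed comparison step: element-wise absolute difference of the
    features extracted from the two acquisition dates."""
    return [
        [abs(a - b) for a, b in zip(r1, r2)]
        for r1, r2 in zip(feats_t1, feats_t2)
    ]
```

In this toy setting, both images would pass through the same backbone and injector, and `change_features` would then compare the two fused feature sets before a decoder predicts the change mask.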
URL
https://arxiv.org/abs/2406.12847