Looking Outside the Window: Wider-Context Transformer for the Semantic Segmentation of High-Resolution Remote Sensing Images

2021-06-29 23:41:54

Lei Ding, Dong Lin, Shaofu Lin, Jing Zhang, Xiaojie Cui, Yuebin Wang, Hao Tang, Lorenzo Bruzzone

arXiv_CV

arXiv_CV Segmentation Semantic_Segmentation Relation Transformer Pose Action

Abstract
Abstract (translated)
URL
PDF

Abstract

Long-range context information is crucial for the semantic segmentation of High-Resolution (HR) Remote Sensing Images (RSIs). The image cropping operations, commonly used for training neural networks, limit the perception of long-range context information in large RSIs. To break this limitation, we propose a Wider-Context Network (WiCNet) for the semantic segmentation of HR RSIs. In the WiCNet, apart from a conventional feature extraction network to aggregate the local information, an extra context branch is designed to explicitly model the context information in a larger image area. The information between the two branches is communicated through a Context Transformer, which is a novel design derived from the Vision Transformer to model the long-range context correlations. Ablation studies and comparative experiments conducted on several benchmark datasets prove the effectiveness of the proposed method. Additionally, we present a new Beijing Land-Use (BLU) dataset. This is a large-scale HR satellite dataset provided with high-quality and fine-grained reference labels, which we hope will boost future studies in this field.

Abstract (translated)

URL

https://arxiv.org/abs/2106.15754

PDF

https://arxiv.org/pdf/2106.15754.pdf