
Wetland mapping from sparse annotations with satellite image time series and temporal-aware segment anything model

2026-01-16 16:10:32
Shuai Yuan, Tianwu Lin, Shuang Chen, Yu Xia, Peng Qin, Xiangyu Liu, Xiaoqing Xu, Nan Xu, Hongsheng Zhang, Jie Wang, Peng Gong

Abstract

Accurate wetland mapping is essential for ecosystem monitoring, yet dense pixel-level annotation is prohibitively expensive. Practical applications therefore usually rely on sparse point labels, under which existing deep learning models perform poorly. Strong seasonal and inter-annual wetland dynamics also make single-date imagery inadequate and lead to significant mapping errors. Foundation models such as SAM show promising generalization from point prompts, but they are designed for static images and do not model temporal information, which results in fragmented masks in heterogeneous wetlands. To overcome these limitations, we propose WetSAM, a SAM-based framework that integrates satellite image time series for wetland mapping from sparse point supervision through a dual-branch design. A temporally prompted branch extends SAM with hierarchical adapters and dynamic temporal aggregation to disentangle wetland characteristics from phenological variability. A spatial branch employs a temporally constrained region-growing strategy to generate reliable dense pseudo-labels. A bidirectional consistency regularization jointly optimizes both branches. Extensive experiments across eight global regions of approximately 5,000 km² each show that WetSAM substantially outperforms state-of-the-art methods, achieving an average F1-score of 85.58% and delivering accurate, structurally consistent wetland segmentation with minimal labeling effort. These results highlight its strong generalization capability and its potential for scalable, low-cost, high-resolution wetland mapping.
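The abstract does not spell out the temporally constrained region-growing step, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each pixel has a temporal profile (one value per acquisition date, for example a water or vegetation index) and that a neighbouring pixel joins a seed's region when its profile stays close to the seed's profile. The function name `grow_regions`, the `max_dist` threshold, and the mean-absolute-difference criterion are all hypothetical choices made for illustration.

```python
from collections import deque

import numpy as np


def grow_regions(series, seeds, max_dist=0.1):
    """Grow labelled regions from sparse seed points (illustrative only).

    series:   float array of shape (T, H, W), one value per date and pixel.
    seeds:    iterable of (row, col, label) tuples with label > 0.
    max_dist: hypothetical threshold on the mean absolute difference
              between a pixel's temporal profile and its seed's profile.
    Returns an int array of shape (H, W); 0 means the pixel stayed unlabelled.
    """
    _, height, width = series.shape
    labels = np.zeros((height, width), dtype=np.int32)
    queue = deque()
    for row, col, label in seeds:
        labels[row, col] = label
        # Track each pixel together with the seed it grew from.
        queue.append((row, col, row, col))

    while queue:
        row, col, seed_row, seed_col = queue.popleft()
        seed_profile = series[:, seed_row, seed_col]
        # Visit the 4-connected neighbours of the current pixel.
        for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            n_row, n_col = row + d_row, col + d_col
            if not (0 <= n_row < height and 0 <= n_col < width):
                continue
            if labels[n_row, n_col] != 0:
                continue  # already claimed by some seed
            # Temporal constraint: compare the whole time series, not one date.
            dist = np.abs(series[:, n_row, n_col] - seed_profile).mean()
            if dist <= max_dist:
                labels[n_row, n_col] = labels[seed_row, seed_col]
                queue.append((n_row, n_col, seed_row, seed_col))
    return labels
```

Comparing full temporal profiles rather than a single date is what keeps a region from leaking into pixels that merely look similar on one acquisition. Pixels that match no seed stay at 0, so the resulting pseudo-labels are dense only where the temporal evidence supports them.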

URL

https://arxiv.org/abs/2601.11400

PDF

https://arxiv.org/pdf/2601.11400.pdf

