Abstract
Foundation models have recently garnered significant attention for their potential to revolutionize visual representation learning in a self-supervised manner. While most foundation models are tailored to process RGB images for various visual tasks, there is a noticeable gap in research on spectral data, which offers valuable information for scene understanding, especially in remote sensing (RS) applications. To fill this gap, we present the first universal RS foundation model, named SpectralGPT, which is purpose-built to handle spectral RS images using a novel 3D generative pretrained transformer (GPT). Compared to existing foundation models, SpectralGPT 1) accommodates input images with varying sizes, resolutions, time series, and regions in a progressive training fashion, enabling full utilization of extensive RS big data; 2) leverages 3D token generation for spatial-spectral coupling; 3) captures spectrally sequential patterns via multi-target reconstruction; and 4) trains on one million spectral RS images, yielding models with over 600 million parameters. Our evaluation shows significant performance improvements with pretrained SpectralGPT models across four downstream tasks: single-label and multi-label scene classification, semantic segmentation, and change detection. These results signify substantial potential for advancing spectral RS big data applications in geoscience.
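The idea of 3D token generation in point 2) can be illustrated with a minimal sketch. It assumes a spectral image stored as a (bands, height, width) array and cuts it into small cubes that span several adjacent bands as well as a spatial patch, so each token carries both spatial and spectral context. The function name `tokenize_3d`, the 12-band 96×96 input, and the 8×8×3 cube size are illustrative assumptions, not values taken from the abstract, and the sketch covers only the tokenization step, not the masking or reconstruction used in pretraining.

```python
import numpy as np


def tokenize_3d(cube, patch_hw=8, patch_bands=3):
    """Split a (bands, height, width) cube into flattened 3D tokens.

    Hypothetical helper: patch sizes are illustrative assumptions.
    Returns an array of shape (num_tokens, patch_bands * patch_hw * patch_hw).
    """
    bands, height, width = cube.shape
    if bands % patch_bands or height % patch_hw or width % patch_hw:
        raise ValueError("cube dimensions must be divisible by the patch sizes")

    nb, nh, nw = bands // patch_bands, height // patch_hw, width // patch_hw

    # Expose the block structure along each axis:
    # (nb, patch_bands, nh, patch_hw, nw, patch_hw)
    blocks = cube.reshape(nb, patch_bands, nh, patch_hw, nw, patch_hw)

    # Bring the three block indices to the front so each token is contiguous:
    # (nb, nh, nw, patch_bands, patch_hw, patch_hw)
    blocks = blocks.transpose(0, 2, 4, 1, 3, 5)

    # One row per token, ordered by band group, then row, then column.
    return blocks.reshape(nb * nh * nw, patch_bands * patch_hw * patch_hw)


# Synthetic stand-in for a 12-band spectral image (assumed shape).
cube = np.arange(12 * 96 * 96, dtype=np.float32).reshape(12, 96, 96)
tokens = tokenize_3d(cube)
print(tokens.shape)
```

Under these assumed sizes, a 2D tokenizer applied to each band separately would yield tokens that see a single wavelength, whereas each 3D token here mixes three adjacent bands over the same spatial patch, which is one plausible reading of "spatial-spectral coupling".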
Abstract (translated)
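Foundation models have recently attracted wide attention for their potential to transform visual representation learning in a self-supervised manner. Although most foundation models are designed to handle various visual tasks effectively, there is a clear gap in research on spectral data, which is highly valuable for scene understanding, especially in remote sensing (RS) applications. To fill this gap, we created the first universal RS foundation model, named SpectralGPT, which is purpose-built to process spectral RS images using a novel 3D generative pretrained transformer (GPT). Compared with existing foundation models, SpectralGPT 1) adapts to input images of different sizes, resolutions, time series, and regions through progressive training, making full use of RS big data; 2) uses 3D token generation for spatial-spectral coupling; 3) captures spectrally sequential patterns through multi-target reconstruction; 4) is trained on one million spectral RS images, yielding models with over 600 million parameters. Our evaluation shows that pretrained SpectralGPT models achieve significant performance improvements, indicating great potential for advancing RS big data applications in geoscience. Although SpectralGPT may not fully replace existing foundation models in some respects, it does show great potential in tackling problems that are currently difficult to solve.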
URL
https://arxiv.org/abs/2311.07113