Abstract
The traditional Transformer model struggles with variable-length input sequences, particularly in Hyperspectral Image Classification (HSIC), raising efficiency and scalability concerns. To overcome this, we propose a pyramid-based hierarchical transformer (PyFormer). This approach organizes input data hierarchically into segments, each representing a distinct abstraction level, thereby improving processing efficiency for long sequences. At each level, a dedicated transformer module captures both local and global context. The flow of spatial and spectral information through the hierarchy enables communication between levels and the propagation of abstractions. The outputs of the different levels are then integrated into the final input representation. Experimental results demonstrate the superiority of the proposed method over traditional approaches. Additionally, the use of disjoint samples strengthens robustness and reliability, highlighting the potential of our approach in advancing HSIC. The source code is available at this https URL.
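The hierarchical idea described above (a pyramid of progressively coarser sequence levels, a transformer module per level, and fusion of the level outputs) can be sketched minimally as follows. This is an illustrative toy, not the paper's implementation: the function names (`pyformer_sketch`, `pool`), the use of unparameterized single-head self-attention, average-pooling by a factor of 2 between levels, and mean-based fusion are all assumptions made for clarity.

```python
import numpy as np

def self_attention(x):
    # Toy single-head self-attention (no learned projections), for illustration only.
    scores = x @ x.T / np.sqrt(x.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

def pool(x, factor=2):
    # Downsample the sequence by averaging consecutive groups of tokens,
    # producing the next (coarser) abstraction level of the pyramid.
    n = (x.shape[0] // factor) * factor
    return x[:n].reshape(-1, factor, x.shape[1]).mean(axis=1)

def pyformer_sketch(x, levels=3):
    # Run attention at each pyramid level, then fuse the per-level global
    # summaries into one final representation (fusion scheme assumed).
    fused = np.zeros(x.shape[1])
    for _ in range(levels):
        fused += self_attention(x).mean(axis=0)  # global summary of this level
        x = pool(x)                              # move to a coarser level
    return fused / levels

tokens = np.random.default_rng(0).normal(size=(16, 8))  # (sequence length, features)
rep = pyformer_sketch(tokens)
print(rep.shape)  # (8,)
```

Processing shorter, coarser sequences at higher levels is what reduces the cost of attention on long inputs: attention is quadratic in sequence length, so each pooling step cuts that level's attention cost by roughly a factor of four.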
URL
https://arxiv.org/abs/2404.14945