Paper Reading AI Learner

Structural-Spectral Graph Convolution with Evidential Edge Learning for Hyperspectral Image Clustering

2025-06-11 16:41:34
Jianhan Qi, Yuheng Jia, Hui Liu, Junhui Hou

Abstract

Hyperspectral image (HSI) clustering assigns similar pixels to the same class without any annotations, which is an important yet challenging task. For large-scale HSIs, most methods rely on superpixel segmentation and perform superpixel-level clustering based on graph neural networks (GNNs). However, existing GNNs cannot fully exploit the spectral information of the input HSI, and the inaccurate superpixel topological graph may lead to the confusion of different class semantics during information aggregation. To address these challenges, we first propose a structural-spectral graph convolutional operator (SSGCO) tailored for graph-structured HSI superpixels to improve their representation quality through the co-extraction of spatial and spectral features. Second, we propose an evidence-guided adaptive edge learning (EGAEL) module that adaptively predicts and refines edge weights in the superpixel topological graph. We integrate the proposed method into a contrastive learning framework to achieve clustering, where representation learning and clustering are simultaneously conducted. Experiments demonstrate that the proposed method improves clustering accuracy by 2.61%, 6.06%, 4.96% and 3.15% over the best compared methods on four HSI datasets. Our code is available at this https URL.
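To make the two ideas in the abstract concrete, here is a minimal numpy sketch of (a) a graph convolution over superpixel spectra that combines a spectral (per-band) convolution with a spatial (neighborhood) aggregation, and (b) a crude similarity-based reweighting of the superpixel graph's edges. Everything here — the shapes, the smoothing kernel, the cosine-similarity reweighting — is an illustrative assumption, not the paper's SSGCO or EGAEL; in particular, the evidential (uncertainty-guided) part of EGAEL is not modeled.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (all shapes and names are illustrative, not from the paper):
# 6 superpixels, each with a 20-band mean spectrum, on a small topological graph.
n_nodes, n_bands = 6, 20
X = rng.standard_normal((n_nodes, n_bands))          # superpixel spectra
A = np.array([[0, 1, 1, 0, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [1, 1, 0, 1, 0, 0],
              [0, 0, 1, 0, 1, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 1, 1, 0]], dtype=float)      # superpixel adjacency

def spectral_conv(X, kernel):
    """1-D convolution along the spectral (band) axis of every node."""
    return np.stack([np.convolve(x, kernel, mode="same") for x in X])

def structural_conv(X, A):
    """Symmetric-normalized neighborhood aggregation (vanilla GCN-style)."""
    A_hat = A + np.eye(len(A))                       # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt @ X

def similarity_edge_weights(X, A):
    """Reweight existing edges by cosine similarity of node features —
    a hand-crafted stand-in for learned, evidence-guided edge refinement."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    S = Xn @ Xn.T
    return A * np.clip(S, 0.0, None)                 # keep only existing, non-negative edges

kernel = np.array([0.25, 0.5, 0.25])                 # toy spectral smoothing kernel
A_refined = similarity_edge_weights(X, A)            # (b) refine edge weights
H = structural_conv(spectral_conv(X, kernel), A_refined)  # (a) spectral + spatial co-extraction
print(H.shape)                                       # prints (6, 20)
```

In the actual method both the spectral filters and the edge weights are learned end-to-end inside a contrastive clustering framework; the fixed kernel and cosine weights above only show where those learned components would plug in.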

Abstract (translated)

The task of hyperspectral image (HSI) clustering is to group similar pixels into the same class without any annotations, which is an important yet challenging task. For large-scale HSIs, most methods rely on superpixel segmentation and perform superpixel-level clustering based on graph neural networks (GNNs). However, existing GNNs cannot fully exploit the spectral information of the input HSI, and an inaccurate superpixel topological graph may cause confusion between different class semantics during information aggregation. To address these problems, we first propose a structural-spectral graph convolutional operator (SSGCO) tailored for graph-structured HSI superpixels, which improves their representation quality by jointly extracting spatial and spectral features. Second, we propose an evidence-guided adaptive edge learning (EGAEL) module that adaptively predicts and refines the edge weights in the superpixel topological graph. We integrate the proposed method into a contrastive learning framework to achieve clustering, in which representation learning and clustering are conducted simultaneously. Experiments show that our method improves clustering accuracy by 2.61%, 6.06%, 4.96% and 3.15% over the best compared methods on four HSI datasets. Our code is available at this https URL.

URL

https://arxiv.org/abs/2506.09920

PDF

https://arxiv.org/pdf/2506.09920.pdf
