Fourier-enhanced Implicit Neural Fusion Network for Multispectral and Hyperspectral Image Fusion

2024-04-23 16:14:20
Yu-Jie Liang, Zihan Cao, Liang-Jian Deng, Xiao Wu

Abstract

Recently, implicit neural representations (INRs) have made significant strides in various vision-related domains, offering a novel solution for multispectral and hyperspectral image fusion (MHIF). However, INRs are prone to losing high-frequency information and lack global perceptual capability. To address these issues, this paper introduces a Fourier-enhanced Implicit Neural Fusion Network (FeINFN) designed specifically for the MHIF task. It is motivated by the following observation: the Fourier amplitudes of the HR-HSI latent code and the LR-HSI are remarkably similar, whereas their phases exhibit different patterns. In FeINFN, we propose a spatial and frequency implicit fusion function (Spa-Fre IFF) that helps the INR capture high-frequency information and enlarges the receptive field. In addition, we introduce a new decoder, the Spatial-Frequency Interactive Decoder (SFID), which uses a complex Gabor wavelet activation function to enhance the interaction of INR features. We further prove theoretically that the Gabor wavelet activation possesses a time-frequency tightness property that favors learning the optimal bandwidths in the decoder. Experiments on two benchmark MHIF datasets verify the state-of-the-art (SOTA) performance of the proposed method, both visually and quantitatively, and ablation studies support each of these contributions. The code will be made available on Anonymous GitHub (https://anonymous.4open.science/r/FeINFN-15C9/) after possible acceptance.
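
The two Fourier-domain notions the abstract relies on, amplitude versus phase and a complex Gabor wavelet activation, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the naive 1-D DFT, the toy signal, and the default values of omega0 and s0 are all assumptions made for illustration, and the Gabor form used here (a complex sinusoid under a Gaussian window) is one common parameterization that may differ from the one in the paper.

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a 1-D sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def amplitude_and_phase(signal):
    """Split the spectrum of `signal` into per-frequency amplitude and phase."""
    spectrum = dft(signal)
    amplitude = [abs(c) for c in spectrum]
    phase = [cmath.phase(c) for c in spectrum]
    return amplitude, phase


def gabor_activation(x, omega0=10.0, s0=10.0):
    """Complex Gabor wavelet: exp(j*omega0*x) * exp(-(s0*x)**2).

    omega0 (frequency) and s0 (Gaussian width) are hypothetical defaults,
    not values taken from the paper.
    """
    return cmath.exp(1j * omega0 * x) * math.exp(-((s0 * x) ** 2))


# Toy illustration: a circular shift leaves every amplitude unchanged
# and alters only the phases of the non-zero frequency components.
signal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
shifted = signal[1:] + signal[:1]

amp_a, pha_a = amplitude_and_phase(signal)
amp_b, pha_b = amplitude_and_phase(shifted)

same_amplitude = all(
    math.isclose(a, b, abs_tol=1e-9) for a, b in zip(amp_a, amp_b)
)
```

In this toy case the two signals share an amplitude spectrum while their phases differ, which mirrors the kind of amplitude/phase split the abstract describes for the HR-HSI latent code and the LR-HSI. The magnitude of the Gabor output depends only on the Gaussian term, so s0 controls how localized the response is and omega0 controls its oscillation frequency.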

URL

https://arxiv.org/abs/2404.15174

PDF

https://arxiv.org/pdf/2404.15174.pdf

