Multi-scale HSV Color Feature Embedding for High-fidelity NIR-to-RGB Spectrum Translation

2024-04-25 15:33:23
Huiyu Zhai, Mo Chen, Xingxing Yang, Gusheng Kang

Abstract

NIR-to-RGB spectral domain translation is a challenging task because the mapping between NIR inputs and RGB outputs is inherently ambiguous. As a result, existing methods fail to reconcile the tension between maintaining texture detail fidelity and achieving diverse color variations. In this paper, we propose a Multi-scale HSV Color Feature Embedding Network (MCFNet) that decomposes the mapping process into three sub-tasks: NIR texture maintenance, coarse geometry reconstruction, and RGB color prediction. Accordingly, we propose one key module for each sub-task: the Texture Preserving Block (TPB), the HSV Color Feature Embedding Module (HSV-CFEM), and the Geometry Reconstruction Module (GRM). With these modules, MCFNet tackles spectral translation methodically through a series of escalating resolutions, progressively enriching images with color and texture fidelity in a scale-coherent fashion. The proposed MCFNet demonstrates substantial performance gains on the NIR image colorization task. Code is released at: this https URL.
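
To make the "multi-scale HSV" idea concrete, here is a minimal sketch of how an RGB image could be turned into a coarse-to-fine set of HSV images. This is not the authors' implementation: the abstract does not specify how HSV features are computed or embedded, and the function names `downsample2x`, `to_hsv`, and `hsv_pyramid`, the 2x average pooling, and the number of levels are all assumptions made for illustration.

```python
import colorsys


def downsample2x(img):
    """Average-pool an H x W image of (r, g, b) tuples by a factor of 2.

    Assumes H and W are even and channel values lie in [0, 1].
    """
    h, w = len(img) // 2, len(img[0]) // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Collect the 2x2 block of pixels that maps to output pixel (y, x).
            block = [img[2 * y + dy][2 * x + dx] for dy in (0, 1) for dx in (0, 1)]
            row.append(tuple(sum(p[c] for p in block) / 4.0 for c in range(3)))
        out.append(row)
    return out


def to_hsv(img):
    """Convert every (r, g, b) pixel to (h, s, v) with the standard library."""
    return [[colorsys.rgb_to_hsv(*p) for p in row] for row in img]


def hsv_pyramid(img, levels=3):
    """Return HSV images ordered from coarsest to finest resolution.

    Pooling is done in RGB space before conversion, because hue is an
    angle and cannot be averaged like an ordinary channel.
    """
    scales = [img]
    for _ in range(levels - 1):
        scales.append(downsample2x(scales[-1]))
    return [to_hsv(s) for s in reversed(scales)]
```

In a setup like the one the abstract describes, such per-scale HSV images could serve as color references at each resolution stage; how MCFNet actually uses them is described in the paper, not here.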

URL

https://arxiv.org/abs/2404.16685

PDF

https://arxiv.org/pdf/2404.16685.pdf

