
Prototype Learning-Based Few-Shot Segmentation for Low-Light Crack on Concrete Structures


Abstract

Crack detection is critical for concrete infrastructure safety, but real-world cracks often appear in low-light environments like tunnels and bridge undersides, degrading computer vision segmentation accuracy. Pixel-level annotation of low-light crack images is extremely time-consuming, yet most deep learning methods require large, well-illuminated datasets. We propose a dual-branch prototype learning network integrating Retinex theory with few-shot learning for low-light crack segmentation. Retinex-based reflectance components guide illumination-invariant global representation learning, while metric learning reduces dependence on large annotated datasets. We introduce a cross-similarity prior mask generation module that computes high-dimensional similarities between query and support features to capture crack location and structure, and a multi-scale feature enhancement module that fuses multi-scale features with the prior mask to alleviate spatial inconsistency. Extensive experiments on multiple benchmarks demonstrate consistent state-of-the-art performance under low-light conditions. Code: this https URL.
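As a rough illustration of the two ideas the abstract names (prototype learning and a similarity-based prior mask), the sketch below uses formulations that are common in the few-shot segmentation literature: masked average pooling for the support prototype, and a per-pixel maximum cosine similarity between query and support features for the prior mask. This is an assumption about the general technique, not the authors' module; the abstract gives no implementation details. The function names, feature shapes, and the min-max normalization are all hypothetical choices made for this example.

```python
import numpy as np


def l2_normalize(x, axis, eps=1e-8):
    """Scale vectors along `axis` to unit length (eps avoids division by zero)."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def masked_average_prototype(support_feat, support_mask, eps=1e-8):
    """Average the support features over the annotated crack pixels.

    support_feat: (C, H, W) feature map; support_mask: (H, W) binary mask.
    Returns a (C,) prototype vector. Illustrative only.
    """
    weighted = support_feat * support_mask[None]
    return weighted.sum(axis=(1, 2)) / (support_mask.sum() + eps)


def prior_mask(query_feat, support_feat, support_mask, eps=1e-8):
    """Score each query pixel by its best cosine match among support crack pixels.

    Assumed formulation: pixel-to-pixel cosine similarity, max over support
    pixels, then min-max normalization to [0, 1]. Returns an (H, W) map.
    """
    c, h, w = query_feat.shape
    q = l2_normalize(query_feat.reshape(c, -1), axis=0)
    # Zero out background support pixels so they contribute similarity 0.
    s = l2_normalize((support_feat * support_mask[None]).reshape(c, -1), axis=0)
    sim = q.T @ s                      # (query pixels, support pixels)
    best = sim.max(axis=1)             # strongest match per query pixel
    best = (best - best.min()) / (best.max() - best.min() + eps)
    return best.reshape(h, w)
```

In a pipeline of the kind the abstract describes, the feature maps would presumably come from a backbone applied to the Retinex reflectance component, and the prior mask would be fused with multi-scale features downstream. Both of those steps are outside this sketch.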


URL

https://arxiv.org/abs/2601.13059

PDF

https://arxiv.org/pdf/2601.13059.pdf

