Paper Reading AI Learner

Text in the Dark: Extremely Low-Light Text Image Enhancement

2024-04-22 12:39:12
Che-Tsung Lin, Chun Chet Ng, Zhi Qin Tan, Wan Jun Nah, Xinyu Wang, Jie Long Kew, Pohao Hsu, Shang Hong Lai, Chee Seng Chan, Christopher Zach

Abstract

Extremely low-light text images are common in natural scenes, making scene text detection and recognition challenging. One solution is to enhance these images with low-light image enhancement methods before text extraction. However, previous methods rarely pay particular attention to low-level features, which are crucial for optimal performance on downstream scene text tasks. Further research is also hindered by the lack of extremely low-light text datasets. To address these limitations, we propose a novel encoder-decoder framework with an edge-aware attention module that focuses on scene text regions during enhancement. Our proposed method uses novel text detection and edge reconstruction losses to emphasize low-level scene text features, leading to successful text extraction. Additionally, we present a Supervised Deep Curve Estimation (Supervised-DCE) model to synthesize extremely low-light images based on publicly available scene text datasets such as ICDAR15 (IC15). We also annotated the text in the extremely low-light See In the Dark (SID) and ordinary LOw-Light (LOL) datasets to allow for objective assessment of extremely low-light image enhancement through scene text tasks. Extensive experiments show that our model outperforms state-of-the-art methods in terms of both image quality and scene text metrics on the widely used LOL, SID, and synthetic IC15 datasets. Code and dataset will be released publicly at this https URL.
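
As a rough illustration of the curve-based synthesis behind Supervised-DCE, the sketch below darkens a normally exposed image with Zero-DCE-style iterative quadratic curves, LE(x) = x + A * x * (1 - x). This is only a minimal sketch under the assumption that Supervised-DCE retains this curve formulation; the function name, iteration count, and the constant curve maps are illustrative and not taken from the paper.

import torch

def apply_curve(img: torch.Tensor, curve_maps: torch.Tensor) -> torch.Tensor:
    # Iteratively apply per-pixel quadratic curves LE(x) = x + A * x * (1 - x).
    # img:        (B, 3, H, W) in [0, 1]
    # curve_maps: (B, 3 * n_iter, H, W) in [-1, 1]; negative values darken the image.
    n_iter = curve_maps.shape[1] // img.shape[1]
    x = img
    for i in range(n_iter):
        a = curve_maps[:, i * 3:(i + 1) * 3]   # curve parameters for this iteration
        x = x + a * x * (1.0 - x)              # quadratic adjustment keeps values in [0, 1]
    return x.clamp(0.0, 1.0)

# Example: synthesize a dark image from a bright one with a constant negative curve.
bright = torch.rand(1, 3, 64, 64)              # stand-in for a well-lit scene text image
dark = apply_curve(bright, torch.full((1, 24, 64, 64), -0.9))

In the paper, the curve maps would presumably be predicted by a network trained with supervision from paired data rather than fixed constants as in this toy example.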

Abstract (translated)

极低光文本图像在自然场景中很常见，使得场景文本检测和识别变得具有挑战性。一种解决方案是在文本提取之前使用低光图像增强方法增强这些图像。然而，之前的方法通常没有特别关注低级别特征的重要性，而这些特征对下游场景文本任务的最佳性能至关重要。此外，极低光文本数据集的缺乏也阻碍了进一步的研究。为了解决这些限制，我们提出了一个新颖的编码器-解码器框架，配备边缘感知注意力模块，以在增强过程中关注场景文本区域。我们的方法利用新的文本检测和边缘重建损失来强调低级别场景文本特征，从而实现成功的文本提取。此外，我们还提出了一个监督深度曲线估计（Supervised-DCE）模型，用于基于公开可用的场景文本数据集（如ICDAR15，IC15）合成极低光图像。我们还对极低光See In the Dark（SID）和普通LOw-Light（LOL）数据集中的文本进行了标注，以便通过场景文本任务对极低光图像增强进行客观评估。大量实验结果表明，我们的模型在广泛使用的LOL、SID和合成IC15数据集上，无论是图像质量还是场景文本指标，都优于最先进的方法。代码和数据集将在此 https URL 公开发布。

URL

https://arxiv.org/abs/2404.14135

PDF

https://arxiv.org/pdf/2404.14135.pdf

