TextCohesion: Detecting Text for Arbitrary Shapes

2019-04-22 13:21:38
Weijia Wu, Jici Xing, Hong Zhou

Abstract

In this paper, we propose TextCohesion, a pixel-wise detector for scene text, especially text with arbitrary shapes. TextCohesion splits a text instance into five key components: a Text Skeleton and four Directional pixel Regions. These components are easier to handle than the entire text instance. We also introduce a confidence scoring mechanism to filter out characters that resemble text. Our method integrates text context intensively and can pick up clues even against very complex backgrounds. Experiments on challenging benchmarks demonstrate that TextCohesion clearly outperforms state-of-the-art methods, achieving F-measures of 84.6 and 86.3 on Total-Text and SCUT-CTW1500, respectively.
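
The abstract reports F-measure without defining it. As a minimal sketch, assuming the usual definition (the harmonic mean of detection precision and recall), it can be computed as below. The function name `f_measure` and the precision/recall values are hypothetical illustrations and are not taken from the paper.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed standard definition)."""
    if precision + recall == 0:
        # No correct detections at all: define the score as 0 to avoid dividing by zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical precision/recall values, chosen only for illustration.
score = f_measure(0.8, 0.9)
print(f"F-measure: {score:.3f}")
```

Under this definition, a detector cannot reach a high F-measure by trading away either precision or recall, since the harmonic mean is pulled toward the smaller of the two.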

URL

https://arxiv.org/abs/1904.12640

PDF

https://arxiv.org/pdf/1904.12640.pdf
