Abstract
In this paper, we propose TextCohesion, a pixel-wise detector for scene text, especially text of arbitrary shape. TextCohesion splits a text instance into five key components: a Text Skeleton and four Directional Pixel Regions. These components are easier to handle than the entire text instance treated directly. We also introduce a confidence scoring mechanism to filter out characters that resemble text. Our method integrates text context intensively and can pick up clues even against very complex backgrounds. Experiments on challenging benchmarks demonstrate that TextCohesion clearly outperforms state-of-the-art methods, achieving F-measures of 84.6 and 86.3 on Total-Text and SCUT-CTW1500, respectively.
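For readers unfamiliar with the metric, the F-measure is conventionally the harmonic mean of precision and recall. The sketch below assumes that standard definition, which the abstract does not spell out; the function name `f_measure` and the input values are hypothetical and are not taken from the paper.

```python
def f_measure(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (the F1 score).

    Assumes the standard F-measure definition; the paper's exact
    evaluation protocol is not described in the abstract.
    """
    if precision + recall == 0:
        # Avoid division by zero when nothing is detected correctly.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical inputs for illustration only; not results from the paper.
example_score = f_measure(0.5, 1.0)
print(round(example_score, 4))
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a detector cannot reach a high F-measure by trading away recall for precision or the reverse.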
URL
https://arxiv.org/abs/1904.12640