Abstract
Bottom-up text detection methods play an important role in arbitrary-shape scene text detection, but two limitations prevent them from reaching their full potential: 1) false text segment detections accumulate and degrade subsequent processing, and 2) reliable connections between text segments are difficult to build. To address these two problems, we propose a novel approach, named "MorphText", that captures the regularity of text by embedding deep morphology for arbitrary-shape text detection. To this end, two deep morphological modules are designed to regularize text segments and determine the linkage between them. First, a Deep Morphological Opening (DMOP) module is constructed to remove false text segment detections generated in the feature extraction process. Then, a Deep Morphological Closing (DMCL) module is proposed to allow text instances of various shapes to stretch their morphology along their most significant orientation while deriving their connections. Extensive experiments conducted on four challenging benchmark datasets (CTW1500, Total-Text, MSRA-TD500 and ICDAR2017) demonstrate that our proposed MorphText outperforms both top-down and bottom-up state-of-the-art arbitrary-shape scene text detection approaches.
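For readers unfamiliar with the operations the two modules are named after, the following is a minimal sketch of classical binary morphological opening and closing on a toy text-segment mask. It is not the DMOP or DMCL module described above: those are learned, deep modules, whereas this sketch uses fixed binary operations. The toy mask, the 1x3 horizontal structuring element, and all function names are assumptions chosen purely for illustration.

```python
# Classical binary morphology on a small 2D mask (lists of 0/1).
# Illustrative only: the structuring element and mask are assumed, not from the paper.

def dilate(mask, offsets):
    """Set a pixel if any neighbour under the structuring element is set."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < h and 0 <= cc < w and mask[rr][cc]:
                    out[r][c] = 1
                    break
    return out

def erode(mask, offsets):
    """Keep a pixel only if every neighbour under the structuring element is set.
    Positions outside the mask are treated as background (0)."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            out[r][c] = int(all(
                0 <= r + dr < h and 0 <= c + dc < w and mask[r + dr][c + dc]
                for dr, dc in offsets
            ))
    return out

def opening(mask, offsets):
    """Erosion then dilation: removes blobs smaller than the structuring element."""
    return dilate(erode(mask, offsets), offsets)

def closing(mask, offsets):
    """Dilation then erosion: fills gaps narrower than the structuring element."""
    return erode(dilate(mask, offsets), offsets)

# Horizontal 1x3 structuring element, standing in for a dominant text orientation.
HORIZONTAL = [(0, -1), (0, 0), (0, 1)]

# Row 0 holds an isolated false detection; row 1 holds two segments split by a one-pixel gap.
mask = [
    [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]

opened = opening(mask, HORIZONTAL)    # isolated pixel removed, segments unchanged
closed = closing(opened, HORIZONTAL)  # the two segments merge into one run

for row in closed:
    print("".join("#" if v else "." for v in row))
```

In this toy case, opening discards the isolated pixel while leaving the two segments intact, and closing along the horizontal direction bridges the gap between them. This loosely mirrors the two roles the abstract assigns to DMOP (removing false segment detections) and DMCL (connecting segments along their most significant orientation).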
URL
https://arxiv.org/abs/2404.17151