Abstract
More compact text shape representations have improved text detection and spotting performance, but at a high annotation cost. Current models use single-point annotations to reduce this cost, yet single points lack sufficient localization information for downstream applications. To overcome this limitation, we introduce Point2Polygon, which efficiently transforms single points into compact polygons. Our method uses a coarse-to-fine process: it first creates and selects anchor points based on recognition confidence, then refines the polygon vertically and horizontally using recognition information to optimize its shape. We demonstrate the accuracy of the generated polygons through extensive experiments: 1) by creating polygons from ground-truth points, we achieve an accuracy of 82.0% on ICDAR 2015; 2) by training detectors with polygons generated by our method, we attain 86% of the accuracy obtained when training with ground truth (GT); 3) Point2Polygon can also be seamlessly integrated into single-point spotters so that they generate polygons, and this integration yields 82.5% accuracy for the generated polygons. Notably, our method relies solely on synthetic recognition information, eliminating the need for any manual annotation beyond single points.
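The coarse-to-fine idea can be illustrated with a toy sketch. This is not the paper's implementation. It assumes an axis-aligned box instead of a polygon, and a hypothetical `score` callable standing in for recognition confidence (here, IoU with a hidden target box). All function names, candidate sizes, and the greedy search are illustrative assumptions.

```python
from itertools import product

# Boxes are (left, top, right, bottom) tuples; a simplification of polygons.

def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def coarse_candidates(point, widths, heights):
    """Coarse stage: boxes of several sizes centred on the single point."""
    x, y = point
    return [(x - w / 2, y - h / 2, x + w / 2, y + h / 2)
            for w, h in product(widths, heights)]

def refine(box, score, edges, step=1.0, max_iter=200):
    """Fine stage: greedily move the listed edges while the score improves."""
    best = list(box)
    best_score = score(tuple(best))
    for _ in range(max_iter):
        improved = False
        for edge in edges:
            for delta in (-step, step):
                trial = list(best)
                trial[edge] += delta
                if trial[0] >= trial[2] or trial[1] >= trial[3]:
                    continue  # skip degenerate boxes
                s = score(tuple(trial))
                if s > best_score:
                    best, best_score, improved = trial, s, True
        if not improved:
            break
    return tuple(best)

def point_to_box(point, score):
    """Select the best coarse candidate, then refine vertically and horizontally."""
    candidates = coarse_candidates(point, widths=(20, 40, 80), heights=(10, 20))
    anchor = max(candidates, key=score)
    box = refine(anchor, score, edges=(1, 3))  # vertical: top and bottom
    box = refine(box, score, edges=(0, 2))     # horizontal: left and right
    return anchor, box

# Toy stand-in for recognition confidence: overlap with a hidden text box.
target = (10, 20, 70, 36)
score = lambda b: iou(b, target)
anchor, box = point_to_box((42, 27), score)
print(anchor, round(score(anchor), 3))
print(box, score(box))
```

In this toy setting the confidence proxy is smooth, so greedy edge moves are enough. A real recognizer's confidence is noisier, and the selection and refinement steps would need to account for that.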
URL
https://arxiv.org/abs/2312.13778