Abstract
Stroke extraction of Chinese characters plays an important role in character recognition and generation. Most existing stroke extraction methods focus on morphological features of the image. Because they rarely use stroke semantics or prior information, these methods often produce errors when extracting crossing strokes and when matching strokes. In this paper, we propose a deep learning-based character stroke extraction method that takes the semantic features and prior information of strokes into account. The method consists of three parts: image registration-based stroke registration, which establishes a rough registration between the reference strokes and the target as prior information; image semantic segmentation-based stroke segmentation, which preliminarily separates the target strokes into seven categories; and high-precision extraction of single strokes. For stroke registration, we propose a structure-deformable image registration network that achieves structure-deformable transformation while keeping the morphology of single strokes stable for character images with complex structures. To verify the effectiveness of the method, we construct two datasets, one for calligraphy characters and one for regular handwritten characters. The experimental results show that our method strongly outperforms the baselines. Code is available at this https URL.
URL
https://arxiv.org/abs/2307.04341