Stroke Extraction of Chinese Character Based on Deep Structure Deformable Image Registration

2023-07-10 04:50:17
Meng Li, Yahan Yu, Yi Yang, Guanghao Ren, Jian Wang

Abstract

Stroke extraction of Chinese characters plays an important role in character recognition and generation. Most existing stroke extraction methods focus on morphological features of the image. Because they rarely use stroke semantics or prior information, these methods often make errors when extracting crossing strokes and when matching strokes. In this paper, we propose a deep learning-based stroke extraction method that takes semantic features and prior information of strokes into account. The method consists of three parts: image registration-based stroke registration, which establishes a rough registration between the reference strokes and the target as prior information; image semantic segmentation-based stroke segmentation, which preliminarily separates the target strokes into seven categories; and high-precision extraction of single strokes. For stroke registration, we propose a structure deformable image registration network that achieves structure-deformable transformation while keeping the morphology of single strokes stable for character images with complex structures. To verify the effectiveness of the method, we construct two datasets, one for calligraphy characters and one for regular handwriting characters. The experimental results show that our method strongly outperforms the baselines. Code is available at this https URL.
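
The registration idea in the abstract (the character's overall structure deforms while each single stroke keeps its shape) can be illustrated with a toy sketch. This is not the paper's network: the per-stroke offsets are supplied by hand rather than predicted, each stroke is moved by a plain translation, and all names (`shift_stroke`, `register_strokes`, `shape_signature`, the stroke labels) are hypothetical and chosen only for this illustration.

```python
# Toy illustration only; the paper uses a learned registration network.
from typing import Dict, List, Tuple

Point = Tuple[int, int]
Stroke = List[Point]


def shift_stroke(stroke: Stroke, offset: Point) -> Stroke:
    """Translate every pixel of one stroke by the same offset, so its shape is unchanged."""
    dx, dy = offset
    return [(x + dx, y + dy) for x, y in stroke]


def register_strokes(reference: Dict[str, Stroke],
                     offsets: Dict[str, Point]) -> Dict[str, Stroke]:
    """Move each reference stroke by its own offset; strokes without an offset stay put."""
    return {name: shift_stroke(pts, offsets.get(name, (0, 0)))
            for name, pts in reference.items()}


def shape_signature(stroke: Stroke) -> List[Point]:
    """Pixel positions relative to the stroke's first pixel, used to compare shapes."""
    x0, y0 = stroke[0]
    return [(x - x0, y - y0) for x, y in stroke]


# Hypothetical reference character made of two crossing strokes.
reference = {
    "horizontal": [(0, 2), (1, 2), (2, 2), (3, 2)],
    "vertical": [(2, 0), (2, 1), (2, 2), (2, 3)],
}
# Hand-picked offsets (an assumption); a real system would estimate these.
offsets = {"horizontal": (0, 1), "vertical": (-1, 0)}
registered = register_strokes(reference, offsets)
```

Because each stroke moves independently, the crossing point of the two strokes changes (the structure deforms), while `shape_signature` of every stroke is identical before and after the move.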

URL

https://arxiv.org/abs/2307.04341

PDF

https://arxiv.org/pdf/2307.04341.pdf

