Open Set Chinese Character Recognition using Multi-typed Attributes

2018-08-27 18:53:31
Sheng He, Lambert Schomaker

Abstract

Recognition of off-line Chinese characters remains a challenging problem, especially in historical documents: not only is the number of classes extremely large compared with contemporary image retrieval methods, but new, unseen classes can also be expected under open learning conditions (even for CNNs). Chinese character recognition with zero or only a few training samples is a difficult problem that has not yet been studied. In this paper, we propose a new Chinese character recognition method based on multi-typed attributes, which are derived from the pronunciation, structure and radicals of Chinese characters, and apply it to character recognition in historical books. This intermediate attribute code has a strong advantage over the common 'one-hot' class representation because it allows complex and unseen patterns to be understood symbolically through attributes. First, each character is represented by four groups of attribute types to cover a wide range of character possibilities: Pinyin label, layout structure, number of strokes, three different input methods (Cangjie, Zhengma and Wubi), and a four-corner encoding method. A convolutional neural network (CNN) is trained to learn these attributes. Subsequently, characters can be recognized from these attributes using a distance metric and a complete lexicon that is encoded in attribute space. We evaluate the proposed method on two open sets (printed Chinese character recognition for zero-shot learning and historical characters for few-shot learning) and on one closed set (handwritten Chinese characters). Experimental results show good general classification of seen classes and a very promising generalization ability to unseen characters.
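
The lookup step described in the abstract (matching predicted attributes against a lexicon encoded in attribute space) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the six-bit attribute codes, the four-character lexicon, the choice of Euclidean distance, and the names `LEXICON`, `euclidean` and `recognize` are hypothetical and do not reproduce the paper's actual encoding or metric. The hard-coded score vector stands in for the output of a trained CNN.

```python
import math

# Hypothetical toy lexicon: each character maps to a binary attribute code.
# The bit meanings and values are made up for illustration; the paper's real
# codes are built from Pinyin, layout, stroke count, input methods, etc.
LEXICON = {
    "木": [1, 0, 0, 1, 0, 0],
    "林": [1, 0, 1, 1, 0, 1],
    "森": [1, 1, 1, 1, 0, 1],
    "明": [0, 0, 1, 0, 1, 1],
}


def euclidean(a, b):
    """Euclidean distance between two equal-length attribute vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(predicted, lexicon=LEXICON):
    """Return the lexicon character whose attribute code is nearest to `predicted`."""
    return min(lexicon, key=lambda ch: euclidean(predicted, lexicon[ch]))


# Stand-in for CNN output: one score in [0, 1] per attribute.
scores = [0.9, 0.8, 0.7, 0.95, 0.1, 0.6]
result = recognize(scores)  # nearest code in this toy lexicon is that of "森"
```

Because recognition only needs a character's attribute code to be present in the lexicon, a class with no training images can still be returned by the lookup, which is the intuition behind the zero-shot setting mentioned in the abstract.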

URL

https://arxiv.org/abs/1808.08993

PDF

https://arxiv.org/pdf/1808.08993.pdf
