
An Ensemble of Neural Networks for Non-Linear Segmentation of Overlapped Cursive Script

2019-04-07 08:32:37
Amjad Rehman

Abstract

Precise character segmentation is the only route to higher Optical Character Recognition (OCR) accuracy. In cursive script, overlapped characters are a serious problem for character segmentation, because a conventional linear segmentation strategy cuts characters off from their discriminative parts. Non-linear segmentation is therefore needed to avoid losing parts of characters and to improve character and script recognition accuracy. This paper presents an improved approach to non-linear segmentation of overlapped characters in handwritten Roman script. The proposed technique applies a sequence of heuristic rules, based on geometrical features of characters, to locate possible non-linear character boundaries in a cursive script word. To improve efficiency, the heuristic approach is integrated with a trained ensemble of neural networks that verifies the character boundaries: correct boundaries are retained and incorrect ones are removed according to the ensemble's vote. Finally, characters are segmented non-linearly at the verified segmentation points. For a fair comparison, experiments are run on the CEDAR benchmark database. The experimental results are much better than those of the conventional linear character segmentation techniques reported in the state of the art. The ensemble of neural networks plays a vital role in improving character segmentation accuracy compared with individual neural networks.
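
The verification step described in the abstract, where an ensemble vote decides which candidate boundaries survive, might be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the voting rule, the feature representation, or the network architecture, so the strict-majority rule, the `Validator` type, and the function names `majority_vote` and `filter_boundaries` are all hypothetical, and the ensemble members are stood in for by plain callables.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in for one trained network in the ensemble: it takes
# the feature vector of a candidate boundary and returns True if it judges
# the boundary to be a correct segmentation point.
Validator = Callable[[Sequence[float]], bool]


def majority_vote(validators: List[Validator], features: Sequence[float]) -> bool:
    """Accept a candidate only if a strict majority of validators accept it.

    Assumption: the paper's actual voting rule is not stated in the
    abstract. With a strict majority, ties count as rejection.
    """
    votes = sum(1 for validate in validators if validate(features))
    return votes * 2 > len(validators)


def filter_boundaries(
    candidates: List[Sequence[float]], validators: List[Validator]
) -> List[Sequence[float]]:
    """Keep the candidates the ensemble accepts, preserving their order."""
    return [c for c in candidates if majority_vote(validators, c)]
```

In use, the candidates would come from the heuristic rules and each validator would wrap a trained network's prediction; simple threshold functions are enough to exercise the voting logic in isolation.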

URL

https://arxiv.org/abs/1904.12592

PDF

https://arxiv.org/pdf/1904.12592.pdf

