
Optimization of Image Processing Algorithms for Character Recognition in Cultural Typewritten Documents

2023-11-27 11:44:46
Mariana Dias, Carla Teixeira Lopes

Abstract

Linked Data is used in various fields as a new way of structuring and connecting data. Cultural heritage institutions have been using linked data to improve archival descriptions and facilitate the discovery of information. Most archival records have digital representations of physical artifacts in the form of scanned images that are non-machine-readable. Optical Character Recognition (OCR) recognizes text in images and translates it into machine-encoded text. This paper evaluates the impact of image processing methods and parameter tuning in OCR applied to typewritten cultural heritage documents. The approach uses a multi-objective problem formulation to minimize Levenshtein edit distance and maximize the number of words correctly identified with a non-dominated sorting genetic algorithm (NSGA-II) to tune the methods' parameters. Evaluation results show that parameterization by digital representation typology benefits the performance of image pre-processing algorithms in OCR. Furthermore, our findings suggest that employing image pre-processing algorithms in OCR might be more suitable for typologies where the text recognition task without pre-processing does not produce good results. In particular, Adaptive Thresholding, Bilateral Filter, and Opening are the best-performing algorithms for the theatre plays' covers, letters, and overall dataset, respectively, and should be applied before OCR to improve its performance.
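The two objectives named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, counting "correctly identified" words as a multiset overlap with the ground truth is an assumption (the abstract does not say how words are matched), and NSGA-II itself is not implemented here. The sketch only shows the Pareto-dominance relation that non-dominated sorting relies on, with the word count negated so that both objectives are minimized.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit cost for insertion, deletion, substitution."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (ca != cb),  # substitution or match
            ))
        previous = current
    return previous[-1]


def correct_words(ocr_text: str, truth: str) -> int:
    """Assumed metric: OCR words that also occur in the ground truth,
    counted as a multiset overlap on whitespace-separated tokens."""
    overlap = Counter(ocr_text.split()) & Counter(truth.split())
    return sum(overlap.values())


def objectives(ocr_text: str, truth: str) -> tuple:
    """Both objectives as values to minimize: (edit distance, -correct words)."""
    return levenshtein(ocr_text, truth), -correct_words(ocr_text, truth)


def dominates(p: tuple, q: tuple) -> bool:
    """p dominates q if it is no worse in every objective and better in one."""
    no_worse = all(x <= y for x, y in zip(p, q))
    better = any(x < y for x, y in zip(p, q))
    return no_worse and better


def pareto_front(points: list) -> list:
    """Points not dominated by any other point (the first non-dominated front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

In a tuning loop, each candidate parameter set for a pre-processing method would be scored by running OCR on the processed image and passing the output to `objectives`; a multi-objective optimizer would then keep the candidates on the non-dominated front.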


URL

https://arxiv.org/abs/2311.15740

PDF

https://arxiv.org/pdf/2311.15740.pdf

