Paper Reading AI Learner

Writing Order Recovery in Complex and Long Static Handwriting

2024-06-05 12:23:17
Moises Diaz, Gioele Crispo, Antonio Parziale, Angelo Marcelli, Miguel A. Ferrer

Abstract

The order in which the trajectory is executed is a powerful source of information for recognizers. However, there is still no general approach for recovering the trajectory of complex and long handwriting from static images. Complex specimens can result in multiple pen-downs and in a high number of trajectory crossings yielding agglomerations of pixels (also known as clusters). While the scientific literature describes a wide range of approaches for recovering the writing order in handwriting, these approaches nevertheless lack a common evaluation metric. In this paper, we introduce a new system to estimate the order recovery of thinned static trajectories, which allows to effectively resolve the clusters and select the order of the executed pen-downs. We evaluate how knowing the starting points of the pen-downs affects the quality of the recovered writing. Once the stability and sensitivity of the system is analyzed, we describe a series of experiments with three publicly available databases, showing competitive results in all cases. We expect the proposed system, whose code is made publicly available to the research community, to reduce potential confusion when the order of complex trajectories are recovered, and this will in turn make the trajectories recovered to be viable for further applications, such as velocity estimation.

Abstract (translated)

轨迹执行的顺序是识别器的重要信息来源。然而,从静态图像中恢复复杂和长手写轨迹仍然没有一种通用的方法。复杂样品可能会导致多次 pen-down,产生大量轨迹交叉点,形成像素聚类(也称为簇)。虽然科学文献描述了从手写中恢复书写顺序的多种方法,但这些方法仍然缺乏一个共同的评估指标。在本文中,我们引入了一个新的系统来估计薄静态轨迹的恢复顺序,这使得有效地解决簇并选择执行的 pen-down 的顺序成为可能。我们评估了知道 pen-down 起点如何影响恢复书写顺序的质量。一旦系统的稳定性和敏感性被分析,我们在三个公开可用的数据库上进行了一系列实验,展示了所有情况下的 competitive 结果。我们预计,向研究社区公开提供代码的系统,当复杂轨迹的顺序被恢复时,将减少潜在的困惑,从而使恢复的轨迹具有进一步应用的价值,如速度估计。

URL

https://arxiv.org/abs/2406.03194

PDF

https://arxiv.org/pdf/2406.03194.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model LLM Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Robot Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot