Abstract
Styled Handwritten Text Generation (Styled HTG) is an important task in document analysis, aiming to generate text images in the handwriting style of given reference images. In recent years, there has been significant progress in the development of deep learning models for this task. Being able to measure the performance of HTG models with a meaningful and representative criterion is key to fostering research on this topic. However, although scores designed for evaluating natural image generation are currently adopted, assessing the quality of generated handwriting remains challenging. In light of this, we devise the Handwriting Distance (HWD), a score tailored for HTG evaluation. In particular, it works in the feature space of a network specifically trained to extract handwriting style features from variable-length input images, and it exploits a perceptual distance to compare the subtle geometric features of handwriting. Through extensive experimental evaluation on different word-level and line-level datasets of handwritten text images, we demonstrate the suitability of the proposed HWD as a score for Styled HTG. The pretrained model used as the backbone will be released to ease adoption of the score, with the aim of providing a valuable tool for evaluating HTG models and thus contributing to the advancement of this important research area.
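To make the idea of a distance computed in a feature space concrete, the following minimal sketch compares two sets of feature vectors by the Euclidean distance between their means. This is an illustration under stated assumptions, not the released HWD implementation: the abstract does not specify the exact form of the perceptual distance, so the Euclidean distance between mean features is an assumed choice, and the names `mean_vector` and `feature_distance` are hypothetical. The feature vectors are assumed to come from some style feature extractor that already maps each variable-length image to a fixed-size vector.

```python
import math
from typing import List, Sequence


def mean_vector(features: Sequence[Sequence[float]]) -> List[float]:
    """Average a non-empty set of equal-length feature vectors."""
    if not features:
        raise ValueError("at least one feature vector is required")
    dim = len(features[0])
    if any(len(f) != dim for f in features):
        raise ValueError("all feature vectors must have the same length")
    n = len(features)
    return [sum(f[i] for f in features) / n for i in range(dim)]


def feature_distance(
    real: Sequence[Sequence[float]],
    generated: Sequence[Sequence[float]],
) -> float:
    """Euclidean distance between the mean feature vectors of two sets.

    Assumption: this stands in for the perceptual distance mentioned in
    the abstract; the actual score may be defined differently.
    """
    return math.dist(mean_vector(real), mean_vector(generated))


# Toy features standing in for the output of a style feature extractor.
real_feats = [[0.0, 0.0], [2.0, 0.0]]        # mean is [1.0, 0.0]
generated_feats = [[4.0, 4.0], [4.0, 4.0]]   # mean is [4.0, 4.0]
print(feature_distance(real_feats, generated_feats))  # 5.0
```

Under this sketch, a lower value means the generated samples lie closer to the reference handwriting in the feature space, and two identical feature sets give a distance of zero.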
URL
https://arxiv.org/abs/2310.20316