Abstract
Document image quality assessment (DIQA) is an important and challenging problem in real-world applications. To predict the quality scores of document images, this paper proposes a novel no-reference DIQA method based on character gradients, with OCR accuracy serving as the ground-truth quality metric. The character gradient is computed on character patches detected with a method based on maximally stable extremal regions (MSER). Character patches are essential to character recognition and are therefore well suited to estimating document image quality. Experiments on a benchmark dataset show that the proposed method outperforms state-of-the-art methods in estimating the quality scores of document images.
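The idea of scoring a page from gradients inside character patches can be illustrated with a minimal sketch. This is not the paper's implementation: the patch boxes are assumed to be supplied already (for example by an MSER detector), the image is a plain list of grayscale rows, and pooling the gradient magnitudes with a standard deviation is an assumption made here for illustration. The names `gradient_magnitude` and `quality_score` are hypothetical.

```python
import math
import statistics


def gradient_magnitude(img, x, y):
    """Central-difference gradient magnitude at (x, y), clamped at the borders."""
    h, w = len(img), len(img[0])
    gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
    gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
    return math.hypot(gx, gy)


def quality_score(img, patches):
    """Pool gradient magnitudes over character patches given as (x, y, w, h).

    Standard-deviation pooling is an assumption of this sketch,
    not something stated in the abstract.
    """
    values = [
        gradient_magnitude(img, x, y)
        for (px, py, pw, ph) in patches
        for y in range(py, py + ph)
        for x in range(px, px + pw)
    ]
    return statistics.pstdev(values) if values else 0.0


# Toy 8x8 "character": a dark vertical stroke on a white background,
# once with crisp edges and once with blurred edges.
sharp_img = [[255, 255, 255, 0, 0, 255, 255, 255] for _ in range(8)]
blurred_img = [[255, 224, 160, 96, 96, 160, 224, 255] for _ in range(8)]

# Hypothetical patch covering the whole toy image; in practice the boxes
# would come from a character detector such as MSER.
patches = [(0, 0, 8, 8)]

sharp = quality_score(sharp_img, patches)
blurred = quality_score(blurred_img, patches)
print(sharp, blurred)
```

On this toy input the crisp stroke receives a higher score than the blurred one, which matches the intuition that degraded characters have weaker, more uniform gradients. Validating such a score against OCR accuracy, as the abstract describes, would require a labeled benchmark and is not shown here.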
URL
https://arxiv.org/abs/1807.04047