Abstract
Object detection in still images has drawn a lot of attention over the past few years, and with the advent of deep learning, impressive performance has been achieved in numerous industrial applications. Most of these deep learning models rely on RGB images to localize and identify objects in the image. However, in some application scenarios, images are compressed either to save storage or to speed up transmission. A time-consuming image decompression step is therefore required before the aforementioned deep models can be applied. To alleviate this drawback, we propose a fast deep architecture for object detection in JPEG images, one of the most widespread compression formats. We train a neural network to detect objects based on the blockwise DCT (discrete cosine transform) coefficients produced by the JPEG compression algorithm. We modify the well-known Single Shot MultiBox Detector (SSD) by replacing its first layers with one convolutional layer dedicated to processing the DCT inputs. Experimental evaluations on PASCAL VOC and an industrial dataset comprising images of road traffic surveillance show that the model is about $2\times$ faster than regular SSD with promising detection performance. To the best of our knowledge, this paper is the first to address detection in compressed JPEG images.
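To make the notion of blockwise DCT coefficients concrete, the following is a minimal sketch of how one image channel can be mapped to a grid of 8x8 DCT blocks. It is an illustration under stated assumptions, not the authors' pipeline: it recomputes the coefficients from pixel values with NumPy, whereas a detector working on compressed data would read them from the JPEG file; it omits quantization and chroma subsampling; and the names `dct_matrix` and `blockwise_dct` are hypothetical.

```python
import numpy as np

BLOCK = 8  # JPEG operates on 8x8 pixel blocks


def dct_matrix(n=BLOCK):
    """Return the n x n orthonormal DCT-II basis matrix."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    x = np.arange(n).reshape(1, -1)  # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # the DC row has its own scaling
    return m


def blockwise_dct(channel):
    """Map an (H, W) channel to an (H/8, W/8, 64) array of DCT coefficients.

    Assumption: H and W are multiples of 8 (no padding is handled here).
    """
    h, w = channel.shape
    assert h % BLOCK == 0 and w % BLOCK == 0, "dimensions must be multiples of 8"
    d = dct_matrix()
    # Level shift as in baseline JPEG for 8-bit samples.
    shifted = channel.astype(np.float64) - 128.0
    # Split into non-overlapping blocks: shape (H/8, W/8, 8, 8).
    blocks = shifted.reshape(h // BLOCK, BLOCK, w // BLOCK, BLOCK).transpose(0, 2, 1, 3)
    # Separable 2-D DCT of every block: D @ B @ D^T (broadcast over the grid).
    coeffs = d @ blocks @ d.T
    # Flatten each block so the 64 frequencies act as input channels.
    return coeffs.reshape(h // BLOCK, w // BLOCK, BLOCK * BLOCK)
```

The output has one spatial position per 8x8 block and 64 values per position, which is the kind of layout a single convolutional layer could take as input. For a constant 16x24 image of value 138, `blockwise_dct` returns an array of shape (2, 3, 64) in which every DC term equals 80 and every AC term is zero.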
URL
https://arxiv.org/abs/1904.08408