A Framework for Real-time Object Detection and Image Restoration

Abstract

Object detection and single image super-resolution are classic problems in computer vision (CV). Object detection aims to locate and recognize the objects in an input image, while image restoration aims to reconstruct a high-quality image from a given low-quality one. This paper proposes a two-stage framework that combines the two tasks. The first stage uses YOLO-series algorithms to detect objects and then crops the image around the detections. The second stage improves on Swin Transformer and uses a newly proposed algorithm for connecting Swin Transformer layers to design a new neural network architecture for image restoration, which we name SwinOIR. For the first stage, this work compares different versions of the YOLO detection algorithms on the MS COCO and Pascal VOC datasets, showing which YOLO models suit the framework in which scenarios. For the image super-resolution task, it compares different methods of connecting Swin Transformer layers and designs SwinOIR models of different sizes for use in different real-world scenarios. The implementation code has been publicly released.
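
The two-stage flow described above (detect, crop, then restore each crop) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `dummy_detect` stands in for a YOLO detector, `upscale_nearest` is a nearest-neighbour placeholder for the SwinOIR network, and all function names, the box format `(x1, y1, x2, y2)`, and the list-of-lists image type are hypothetical choices made for this example.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]          # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2), with x2 and y2 exclusive


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by box out of image, clamped to the image bounds."""
    height, width = len(image), len(image[0])
    x1, y1, x2, y2 = box
    x1, x2 = max(0, x1), min(width, x2)
    y1, y2 = max(0, y1), min(height, y2)
    return [row[x1:x2] for row in image[y1:y2]]


def upscale_nearest(image: Image, scale: int = 2) -> Image:
    """Placeholder for stage two: nearest-neighbour upscaling, NOT SwinOIR."""
    return [
        [pixel for pixel in row for _ in range(scale)]
        for row in image
        for _ in range(scale)
    ]


def dummy_detect(image: Image) -> List[Box]:
    """Placeholder for stage one: returns one fixed box instead of running YOLO."""
    return [(1, 1, 3, 3)]


def run_pipeline(
    image: Image,
    detect: Callable[[Image], List[Box]],
    restore: Callable[[Image], Image],
) -> List[Image]:
    """Stage one detects and crops; stage two restores each crop independently."""
    return [restore(crop(image, box)) for box in detect(image)]


# A 4x4 test image whose pixel values are 0..15 in row-major order.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
results = run_pipeline(image, dummy_detect, upscale_nearest)
```

Passing the detector and the restorer in as callables keeps the two stages independent, which mirrors the abstract's point that different YOLO versions and different SwinOIR sizes can be swapped in depending on the scenario.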

URL

https://arxiv.org/abs/2303.09190

PDF

https://arxiv.org/pdf/2303.09190.pdf