Abstract
Shot boundary detection (SBD) is an important first step in many video processing applications. This paper presents a simple, modular convolutional neural network architecture that achieves state-of-the-art results on the RAI dataset while running well above real-time speed, even on a single mid-range GPU. The network employs dilated convolutions and operates only on small, resized frames. Training used randomly generated transitions built from selected shots of the TRECVID IACC.3 dataset. The code and a selected trained network will be available at this https URL.
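The idea of training on randomly generated transitions can be illustrated with a minimal NumPy sketch. It is not the authors' pipeline: the function names, the two transition types (hard cut and linear cross-dissolve), the labeling convention, and the `max_dissolve` parameter are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def make_transition(shot_a, shot_b, start, length):
    """Join two shots of shape (T, H, W, C) into one clip of T frames.

    length == 0 gives a hard cut at frame `start`; otherwise a linear
    cross-dissolve covers frames start .. start + length - 1.
    Returns a float32 clip and per-frame labels (1 = transition frame).
    The labeling convention is an assumption, not taken from the paper.
    """
    n = shot_a.shape[0]
    t = np.arange(n, dtype=np.float32)
    if length == 0:
        # Step function: shot_a before `start`, shot_b from `start` on.
        alpha = (t >= start).astype(np.float32)
    else:
        # Ramp that is 0 before `start` and 1 after the dissolve ends.
        alpha = np.clip((t - start + 1) / (length + 1), 0.0, 1.0)
    a = alpha[:, None, None, None]  # broadcast over H, W, C
    clip = ((1.0 - a) * shot_a + a * shot_b).astype(np.float32)
    # Blended frames are the ones with 0 < alpha < 1.
    labels = ((alpha > 0) & (alpha < 1)).astype(np.int64)
    if length == 0:
        labels[start] = 1  # mark the first frame of the new shot
    return clip, labels


def random_transition(shot_a, shot_b, rng, max_dissolve=8):
    """Sample a random transition type and position (hypothetical helper).

    Assumes both shots have the same shape and more than
    `max_dissolve` frames.
    """
    n = shot_a.shape[0]
    length = int(rng.integers(0, max_dissolve + 1))  # 0 means hard cut
    start = int(rng.integers(1, n - max(length, 1) + 1))
    return make_transition(shot_a, shot_b, start, length)
```

A call such as `random_transition(shot_a, shot_b, np.random.default_rng(0))` on two equally shaped arrays of small resized frames returns one training clip with its frame-level labels. Sampling a fresh pair of shots and a fresh transition for every example is one way to obtain effectively unlimited labeled data without manual annotation.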
URL
https://arxiv.org/abs/1906.03363