Abstract
Future frame prediction has been approached through two primary methods: autoregressive and non-autoregressive. Autoregressive methods rely on the Markov assumption and can achieve high accuracy in the early stages of prediction, before errors have accumulated. However, their performance tends to decline as the number of time steps increases. In contrast, non-autoregressive methods can achieve relatively high performance but lack correlation between the predictions at different time steps. In this paper, we propose an Implicit Stacked Autoregressive Model for Video Prediction (IAM4VP), an implicit video prediction model that applies a stacked autoregressive method. Like non-autoregressive methods, stacked autoregressive methods use the same observed frame to estimate all future frames. However, like autoregressive methods, they also use their own predictions as input. As the number of time steps increases, predictions are sequentially stacked in a queue. To evaluate the effectiveness of IAM4VP, we conducted experiments on three common future frame prediction benchmark datasets and on weather and climate prediction benchmark datasets. The results demonstrate that our proposed model achieves state-of-the-art performance.
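The stacked autoregressive rollout described above can be illustrated with a minimal sketch. It is not the IAM4VP implementation: `toy_predictor` is a hypothetical stand-in for the learned network, and the array shapes, the function names, and the use of a short observed clip rather than a single frame are all assumptions made for illustration. The sketch shows only the control flow stated in the abstract: every step receives the same observed input, plus a queue of the model's own earlier predictions, and each new prediction is appended to that queue.

```python
import numpy as np


def toy_predictor(observed, queue, t):
    """Hypothetical stand-in for a learned model (NOT the IAM4VP network).

    observed: array of shape (T_in, H, W), the fixed observed input
    queue:    list of earlier predictions, each of shape (H, W)
    t:        index of the future time step being predicted
    """
    context = observed.mean(axis=0)
    if queue:
        # Mix in the model's own earlier predictions (autoregressive-like).
        context = 0.5 * context + 0.5 * np.mean(queue, axis=0)
    # Conditioning on t stands in for an explicit time-step input (assumption).
    return context + 0.01 * (t + 1)


def stacked_autoregressive_rollout(predict, observed, num_future):
    """Predict num_future frames, stacking each prediction in a queue."""
    queue = []
    for t in range(num_future):
        # The observed input is the same at every step (non-autoregressive-like).
        frame = predict(observed, queue, t)
        queue.append(frame)
    return np.stack(queue, axis=0)


observed = np.zeros((4, 8, 8))
future = stacked_autoregressive_rollout(toy_predictor, observed, num_future=3)
print(future.shape)  # (3, 8, 8)
```

In a purely autoregressive rollout the observed input would instead be shifted or replaced by predictions at each step, and in a purely non-autoregressive one the queue would be absent. The sketch keeps both inputs to make that contrast visible.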
URL
https://arxiv.org/abs/2303.07849