TimeGate: Conditional Gating of Segments in Long-range Activities

2020-04-03 23:14:35

Noureldien Hussein, Mihir Jain, Babak Ehteshami Bejnordi

arXiv_CV

Abstract

When recognizing a long-range activity, exploring the entire video is exhaustive and computationally expensive, as it can span up to a few minutes. Thus, it is of great importance to sample only the salient parts of the video. We propose TimeGate, along with a novel conditional gating module, for sampling the most representative segments from the long-range activity. TimeGate has two novelties that address the shortcomings of previous sampling methods, such as SCSampler. First, it enables a differentiable sampling of segments. Thus, TimeGate can be fitted with modern CNNs and trained end-to-end as a single and unified model. Second, the sampling is conditioned on both the segments and their context. Consequently, TimeGate is better suited for long-range activities, where the importance of a segment heavily depends on the video context. TimeGate reduces the computation of existing CNNs on three benchmarks for long-range activities: Charades, Breakfast and MultiThumos. In particular, TimeGate reduces the computation of I3D by 50% while maintaining the classification accuracy.
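
The abstract does not spell out how the gating module is built, so the following is only a minimal sketch of the general idea of context-conditioned gating, not the authors' implementation. It assumes each segment has a cheap feature vector, takes the mean over all segments as the "context", and scores each segment with a sigmoid over a linear function of both. The names `gate_segments`, `w_seg`, `w_ctx`, and the threshold of 0.5 are hypothetical. Only the forward pass is shown; a trainable version would presumably need a differentiable relaxation of the binary decision, which is left out here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def gate_segments(features, w_seg, w_ctx, bias=0.0, threshold=0.5):
    """Score each segment from its own features plus a shared video context.

    Assumption (not from the paper): context is the mean of all segment
    features, and the score is a sigmoid of a linear combination.
    Returns (soft_scores, binary_gates).
    """
    n = len(features)
    dim = len(features[0])
    # Shared context vector: per-dimension mean over all segments.
    context = [sum(f[d] for f in features) / n for d in range(dim)]
    ctx_term = dot(w_ctx, context)
    scores = [sigmoid(dot(w_seg, f) + ctx_term + bias) for f in features]
    # Hard decision: only segments with gate == 1 would be sent to the
    # expensive CNN.
    gates = [1 if s >= threshold else 0 for s in scores]
    return scores, gates


# Toy video with four segments and 2-D features (made-up numbers).
features = [[2.0, 1.0], [-2.0, 1.0], [1.0, 1.0], [-1.0, 1.0]]
w_seg = [1.0, 0.0]

# Without context, two of the four segments pass the gate.
_, gates_no_ctx = gate_segments(features, w_seg, w_ctx=[0.0, 0.0])

# With a context weight, the same segments are judged more strictly.
scores, gates = gate_segments(features, w_seg, w_ctx=[0.0, -1.5])

kept_fraction = sum(gates) / len(gates)
print(gates_no_ctx, gates, kept_fraction)
```

In this toy setup the same segment features yield different selections depending on the context term, which is the property the abstract highlights. The fraction of segments kept gives a rough proxy for how much of the heavy network's computation would remain.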

URL

https://arxiv.org/abs/2004.01808

PDF

https://arxiv.org/pdf/2004.01808.pdf