Charades-Ego: A Large-Scale Dataset of Paired Third and First Person Videos

Abstract

In Actor and Observer we introduced the Charades-Ego Dataset, which links the first-person and third-person video understanding domains. In this paper we describe the egocentric aspect of the dataset and present annotations for Charades-Ego with 68,536 activity instances in 68.8 hours of first and third-person video, making it one of the largest and most diverse egocentric datasets available. Charades-Ego furthermore shares activity classes, scripts, and methodology with the Charades dataset, which consists of an additional 82.3 hours of third-person video with 66,500 activity instances. Charades-Ego has temporal annotations and textual descriptions, making it suitable for egocentric video classification, localization, captioning, and new tasks utilizing the cross-modal nature of the data.
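
To put the figures quoted in the abstract side by side, the following minimal sketch computes annotation density (activity instances per hour of video) for each dataset and for the two combined. Only the hour and instance counts come from the abstract; the `DatasetStats` class and `combine` helper are hypothetical names introduced for illustration, and treating the two datasets as additive relies on the abstract's wording that Charades contributes "additional" hours.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetStats:
    """Headline figures for one dataset (hypothetical helper type)."""
    name: str
    hours: float
    instances: int

    @property
    def instances_per_hour(self) -> float:
        # Annotation density: activity instances per hour of video.
        return self.instances / self.hours


def combine(a: DatasetStats, b: DatasetStats, name: str) -> DatasetStats:
    # Assumes the two datasets do not overlap, so hours and instances add.
    return DatasetStats(name, a.hours + b.hours, a.instances + b.instances)


# Figures as stated in the abstract.
CHARADES_EGO = DatasetStats("Charades-Ego", 68.8, 68_536)
CHARADES = DatasetStats("Charades", 82.3, 66_500)
COMBINED = combine(CHARADES_EGO, CHARADES, "Charades-Ego + Charades")

for stats in (CHARADES_EGO, CHARADES, COMBINED):
    print(f"{stats.name}: {stats.hours:.1f} h, "
          f"{stats.instances:,} instances, "
          f"{stats.instances_per_hour:.0f} instances/hour")
```

Because activity instances can overlap in time, instances per hour is a density measure only and says nothing about the average duration of an activity.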

URL

https://arxiv.org/abs/1804.09626

PDF

https://arxiv.org/pdf/1804.09626.pdf