Abstract
In Actor and Observer we introduced the Charades-Ego Dataset, a dataset linking the first- and third-person video understanding domains. In this paper we describe the egocentric aspect of the dataset and present annotations for Charades-Ego with 68,536 activity instances in 68.8 hours of first- and third-person video, making it one of the largest and most diverse egocentric datasets available. Charades-Ego furthermore shares activity classes, scripts, and methodology with the Charades dataset, which consists of an additional 82.3 hours of third-person video with 66,500 activity instances. Charades-Ego has temporal annotations and textual descriptions, making it suitable for egocentric video classification, localization, captioning, and new tasks utilizing the cross-modal nature of the data.
URL
https://arxiv.org/abs/1804.09626