Abstract
Memes are the new-age vehicle for humor on social media sites. A meme typically combines an image with some text. Because memes can be used to promote disinformation or hatred, it is crucial to study them in detail. We introduce Memotion 3, a new dataset of 10,000 annotated memes. Unlike other prevalent datasets in the domain, including prior iterations of Memotion, which were limited to English memes, Memotion 3 introduces Hindi-English code-mixed memes. We describe the Memotion task, the data collection, and the dataset creation methodologies. We also provide a baseline for the task. The baseline code and dataset will be made available at this https URL.
URL
https://arxiv.org/abs/2303.09892