Abstract
In this paper, we propose a recommender system that takes a user-selected story as input and returns a ranked list of similar stories on the basis of shared literary themes. The user of our system first selects a story of interest from a list of background stories, and then sets, as desired, a handful of knowledge-based filtering options, including the similarity measure used to quantify the similarity between story pairs. As a proof of concept, we experimentally validate our system on a dataset comprising 452 manually themed episodes of the Star Trek television franchise, using a benchmark of curated sets of related stories. We show that our manual approach to theme assignment significantly outperforms an automated approach to theme identification based on the application of topic models to episode transcripts. Additionally, we compare different approaches to constructing similarity functions between stories, some based on sets of themes and others on a hierarchical-semantic organization of themes. The recommender system is implemented in the R package stoRy. A related R Shiny web application is publicly available, along with the Star Trek dataset, which includes the theme ontology, episode annotations, storyset benchmarks, transcripts, and evaluation setup.
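To make the set-based idea concrete, the following is a minimal sketch of ranking stories by the overlap of their theme sets. It is written in Python for illustration only and does not reproduce the stoRy package or its API. The Jaccard index is used here as one plausible set-based similarity; the abstract does not say which measures the system offers. The function names, story identifiers, and theme assignments are all hypothetical.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard index of two theme sets: |A & B| / |A | B| (0.0 if both are empty)."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def recommend(query_id: str, themes_by_story: dict[str, set[str]], top_n: int = 3):
    """Return up to top_n (story_id, score) pairs, most similar to the query first."""
    query_themes = themes_by_story[query_id]
    scored = [
        (story_id, jaccard(query_themes, themes))
        for story_id, themes in themes_by_story.items()
        if story_id != query_id  # never recommend the query story itself
    ]
    # Sort by descending score; break ties alphabetically for a stable ranking.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


# Hypothetical toy data: identifiers and themes are made up for this sketch.
TOY_STORIES = {
    "story_a": {"first contact", "the desire for vengeance", "artificial intelligence"},
    "story_b": {"first contact", "artificial intelligence"},
    "story_c": {"the desire for vengeance"},
    "story_d": {"time travel"},
}

if __name__ == "__main__":
    for story_id, score in recommend("story_a", TOY_STORIES):
        print(f"{story_id}: {score:.3f}")
```

A purely set-based measure such as this one treats every pair of distinct themes as equally unrelated. A hierarchy-aware measure, of the kind the abstract contrasts it with, could instead give partial credit to themes that sit close together in the theme ontology.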
URL
https://arxiv.org/abs/1808.00103