CollabStory: Multi-LLM Collaborative Story Generation and Authorship Analysis

2024-06-18 14:35:12
Saranya Venkatraman, Nafis Irtiza Tripto, Dongwon Lee

Abstract

The rise of unifying frameworks that enable seamless interoperability of Large Language Models (LLMs) has made LLM-LLM collaboration on open-ended tasks possible. Despite this, no prior work has explored such collaborative writing. We take the next step beyond human-LLM collaboration and explore this multi-LLM scenario by generating CollabStory, the first dataset of collaborative stories written exclusively by LLMs. We cover scenarios ranging from a single author ($N=1$) to multiple authors (up to $N=5$), in which several LLMs co-author a story. We generate over 32k stories using open-source instruction-tuned LLMs. Further, we take inspiration from the PAN tasks, which have set the standard for human-human multi-author writing tasks and analysis. We extend their authorship-related tasks to multi-LLM settings and present baselines for LLM-LLM collaboration. We find that current baselines are not able to handle this emerging scenario. CollabStory is thus a resource that could help advance both the understanding of multi-LLM writing and the development of techniques to detect the use of multiple LLMs. This is crucial to study in the context of writing tasks, since LLM-LLM collaboration could exacerbate ongoing challenges related to plagiarism detection, credit assignment, maintaining academic integrity in educational settings, and addressing copyright infringement concerns. We make our dataset and code publicly available.


URL

https://arxiv.org/abs/2406.12665

PDF

https://arxiv.org/pdf/2406.12665.pdf

