Abstract
Personalized content generation by LLMs presents a novel challenge: how to efficiently adapt text to individual preferences without the unsustainable cost of creating a unique model for each user. This study introduces an online method that uses neural bandit algorithms to dynamically optimize soft instruction embeddings based on user feedback, improving the personalization of open-ended text generation by white-box LLMs. Experiments on several tasks show significant performance improvements over baseline strategies. NeuralTS in particular yields substantial gains in personalized news headline generation, achieving up to a 62.9% improvement in best ROUGE scores and an increase of up to 2.76% in LLM-agent evaluation over the baseline.
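To make the idea of bandit-driven selection of instruction embeddings concrete, the following is a minimal sketch and not the authors' implementation. It assumes a small, fixed pool of candidate embedding vectors (toy vectors here), a simulated scalar feedback signal in place of real user feedback, and a linear reward model standing in for the neural network used by NeuralTS. The per-candidate sampling of a reward from a Gaussian centered on the predicted reward follows the general Thompson sampling pattern. All names (`LinearThompsonSampler`, `simulate`, `hidden_pref`) are hypothetical.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def mat_vec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [dot(row, v) for row in m]


class LinearThompsonSampler:
    """Thompson sampling with a ridge-regression reward model.

    Assumption: a linear model replaces the neural reward predictor,
    so the "gradient" feature of each candidate is the candidate itself.
    """

    def __init__(self, dim, lam=1.0, nu=0.1, seed=0):
        self.dim = dim
        self.nu = nu  # exploration scale
        # Inverse of the regularized design matrix, initially (lam * I)^-1.
        self.a_inv = [[(1.0 / lam if i == j else 0.0) for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim
        self.theta = [0.0] * dim
        self.rng = random.Random(seed)

    def select(self, arms):
        """Sample a reward for each candidate and return the best index."""
        best_idx, best_score = 0, -math.inf
        for k, x in enumerate(arms):
            mean = dot(self.theta, x)
            var = max(dot(x, mat_vec(self.a_inv, x)), 0.0)
            score = self.rng.gauss(mean, self.nu * math.sqrt(var))
            if score > best_score:
                best_idx, best_score = k, score
        return best_idx

    def update(self, x, reward):
        """Rank-one (Sherman-Morrison) update after observing feedback."""
        ax = mat_vec(self.a_inv, x)
        denom = 1.0 + dot(x, ax)
        for i in range(self.dim):
            for j in range(self.dim):
                self.a_inv[i][j] -= ax[i] * ax[j] / denom
        self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]
        self.theta = mat_vec(self.a_inv, self.b)


def simulate(rounds=200, seed=0):
    """Run the sampler against simulated feedback; return observed rewards."""
    rng = random.Random(seed)
    dim, n_arms = 4, 8
    # Hypothetical hidden user preference and candidate embeddings.
    hidden_pref = [rng.uniform(-1, 1) for _ in range(dim)]
    arms = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_arms)]
    sampler = LinearThompsonSampler(dim, seed=seed)
    rewards = []
    for _ in range(rounds):
        k = sampler.select(arms)
        reward = dot(hidden_pref, arms[k]) + rng.gauss(0.0, 0.05)
        sampler.update(arms[k], reward)
        rewards.append(reward)
    return rewards


if __name__ == "__main__":
    observed = simulate()
    print("mean reward, first 20 rounds:", sum(observed[:20]) / 20)
    print("mean reward, last 20 rounds:", sum(observed[-20:]) / 20)
```

In an actual system the simulated reward would be replaced by a score derived from user feedback on text generated with the chosen embedding, and the linear model by a neural predictor. The sketch is meant only to show the select, observe, update loop.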
URL
https://arxiv.org/abs/2404.16115