RAGViz: Diagnose and Visualize Retrieval-Augmented Generation

2024-11-04 02:30:05
Tevin Wang, Jingyuan He, Chenyan Xiong

Abstract

Retrieval-augmented generation (RAG) combines knowledge from domain-specific sources into large language models to ground answer generation. Current RAG systems lack customizable visibility on the context documents and the model's attentiveness towards such documents. We propose RAGViz, a RAG diagnosis tool that visualizes the attentiveness of the generated tokens in retrieved documents. With a built-in user interface, retrieval index, and Large Language Model (LLM) backbone, RAGViz provides two main functionalities: (1) token and document-level attention visualization, and (2) generation comparison upon context document addition and removal. As an open-source toolkit, RAGViz can be easily hosted with a custom embedding model and HuggingFace-supported LLM backbone. Using a hybrid ANN (Approximate Nearest Neighbor) index, memory-efficient LLM inference tool, and custom context snippet method, RAGViz operates efficiently with a median query time of about 5 seconds on a moderate GPU node. Our code is available at this https URL. A demo video of RAGViz can be found at this https URL.
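
To make the idea of document-level attention concrete, here is a minimal sketch of one way token-level attention could be rolled up into per-document scores. The abstract does not say how RAGViz performs this aggregation, so the summation rule, the function name `document_attention`, the span layout, and the toy weights are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def document_attention(
    token_attention: List[float],
    doc_spans: Dict[str, Tuple[int, int]],
) -> Dict[str, float]:
    """Sum per-token attention weights inside each document's token span.

    token_attention: attention from one generated token to every context token.
    doc_spans: hypothetical mapping of document id -> (start, end) token
               indices, with `end` exclusive.
    """
    scores: Dict[str, float] = {}
    for doc_id, (start, end) in doc_spans.items():
        scores[doc_id] = sum(token_attention[start:end])
    return scores


# Toy data (made up): six context tokens split across three documents.
attention = [0.05, 0.10, 0.40, 0.25, 0.15, 0.05]
spans = {"doc_a": (0, 2), "doc_b": (2, 5), "doc_c": (5, 6)}

scores = document_attention(attention, spans)
# Rank documents from most to least attended.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In this toy case `doc_b` receives most of the attention mass, so it ranks first. Other aggregation rules (mean per token, or averaging over heads and layers before summing) are equally plausible choices.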

URL

https://arxiv.org/abs/2411.01751

PDF

https://arxiv.org/pdf/2411.01751.pdf

