
"It's Not Just Labeling" -- A Research on LLM Generated Feedback Interpretability and Image Labeling Sketch Features

2025-05-26 02:13:52
Baichuan Li, Larry Powell, Tracy Hammond

Abstract

The quality of training data is critical to the performance of machine learning applications in domains like transportation, healthcare, and robotics. Accurate image labeling, however, often relies on time-consuming, expert-driven methods with limited feedback. This research introduces a sketch-based annotation approach supported by large language models (LLMs) to reduce technical barriers and enhance accessibility. Using a synthetic dataset, we examine how sketch recognition features relate to LLM feedback metrics, aiming to improve the reliability and interpretability of LLM-assisted labeling. We also explore how prompting strategies and sketch variations influence feedback quality. Our main contribution is a sketch-based virtual assistant that simplifies annotation for non-experts and advances LLM-driven labeling tools in terms of scalability, accessibility, and explainability.

URL

https://arxiv.org/abs/2505.19419

PDF

https://arxiv.org/pdf/2505.19419.pdf

