Abstract
Image labeling is a critical bottleneck in the development of computer vision technologies, often constraining the potential of machine learning models due to the time-intensive nature of manual annotations. This work introduces a novel approach that leverages outpainting to address the problem of annotated data scarcity by generating artificial contexts and annotations, significantly reducing manual labeling efforts. We apply this technique to a particularly acute challenge in autonomous driving, urban planning, and environmental monitoring: the lack of diverse, eye-level vehicle images in desired classes. Our dataset comprises AI-generated vehicle images obtained by detecting and cropping vehicles from manually selected seed images, which are then outpainted onto larger canvases to simulate varied real-world conditions. The outpainted images include detailed annotations, providing high-quality ground truth data. Advanced outpainting techniques and image quality assessments ensure visual fidelity and contextual relevance. Augmentation with outpainted vehicles improves overall performance metrics by up to 8% and enhances prediction of underrepresented classes by up to 20%. This approach, exemplifying outpainting as a self-annotating paradigm, presents a solution that enhances dataset versatility across multiple domains of machine learning. The code and links to datasets used in this study are available for further research and replication at this https URL.
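The "self-annotating" property described above follows from the geometry of the pipeline: because the vehicle crop is placed on the larger canvas at a position the pipeline itself chooses, the bounding-box label is known by construction and requires no manual annotation. The sketch below illustrates only this bookkeeping step; the function name and dictionary keys are illustrative rather than taken from the paper, and the actual filling of the surrounding context by a generative outpainting model (e.g. a diffusion inpainting pipeline) is omitted.

```python
def place_crop_on_canvas(crop_w, crop_h, canvas_w, canvas_h, x, y):
    """Return a COCO-style annotation for a vehicle crop pasted at (x, y).

    The area outside the pasted crop is what an outpainting model would
    synthesize; the label comes for free because the placement is known.
    (Illustrative sketch, not the paper's actual implementation.)
    """
    if x < 0 or y < 0 or x + crop_w > canvas_w or y + crop_h > canvas_h:
        raise ValueError("crop does not fit on the canvas at this position")
    return {
        "bbox": [x, y, crop_w, crop_h],                 # [x, y, width, height]
        "area": crop_w * crop_h,
        "keep_region": [x, y, x + crop_w, y + crop_h],  # pixels preserved; the rest is outpainted
    }

# Example: a 200x120 crop placed at (100, 200) on a 640x480 canvas.
ann = place_crop_on_canvas(crop_w=200, crop_h=120, canvas_w=640, canvas_h=480, x=100, y=200)
print(ann["bbox"])  # → [100, 200, 200, 120]
```

Varying the placement, canvas size, and outpainted context is what lets a single seed vehicle yield many distinct, fully labeled training images.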
URL
https://arxiv.org/abs/2410.24116