Abstract
Image Coding for Machines (ICM) is an image compression technique designed for image recognition rather than human viewing. It is becoming essential as demand for image recognition AI grows. In this paper, we propose SA-ICM, an ICM method that encodes and decodes only the edge information of object parts in an image. SA-ICM is a Learned Image Compression (LIC) model trained using edge information created by Segment Anything. Our method can be used with image recognition models for various tasks. SA-ICM is also robust to changes in input data, making it effective for a variety of use cases. Additionally, our method offers a privacy benefit: it removes human facial information on the encoder side. Furthermore, the same training method can be applied to Neural Representations for Videos (NeRV), a video compression model. Training NeRV with edge information created by Segment Anything yields a NeRV that is effective for image recognition (SA-NeRV). Experimental results confirm the advantages of SA-ICM, which achieves the best performance in image compression for image recognition. We also show that SA-NeRV is superior to ordinary NeRV in video compression for machines.
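The idea of training on edge information only can be illustrated with a minimal sketch. The abstract does not give the loss function, so what follows is an assumption rather than the authors' implementation: a distortion term that counts reconstruction error only on pixels lying on segment boundaries. The function names `edge_mask_from_labels` and `masked_mse` are hypothetical, and the integer label map is a stand-in for a segmentation such as one Segment Anything might produce; no Segment Anything API is called here.

```python
import numpy as np


def edge_mask_from_labels(labels):
    """Return a boolean H x W mask that is True on segment boundaries.

    `labels` is an H x W integer map (hypothetical stand-in for a
    segmentation output). A pixel is marked as an edge if its label
    differs from a horizontal or vertical neighbour.
    """
    labels = np.asarray(labels)
    mask = np.zeros(labels.shape, dtype=bool)
    horiz = labels[:, :-1] != labels[:, 1:]   # left/right label changes
    vert = labels[:-1, :] != labels[1:, :]    # up/down label changes
    mask[:, :-1] |= horiz
    mask[:, 1:] |= horiz
    mask[:-1, :] |= vert
    mask[1:, :] |= vert
    return mask


def masked_mse(original, reconstructed, mask):
    """Mean squared error over edge pixels only (assumed distortion term).

    Works for H x W or H x W x C images; pixels outside the mask are
    ignored, so the codec is not penalised for discarding them.
    """
    diff = (np.asarray(original, dtype=np.float64)
            - np.asarray(reconstructed, dtype=np.float64)) ** 2
    if not mask.any():
        return 0.0
    return float(diff[mask].mean())
```

In a rate-distortion training objective, a term like `masked_mse` would take the place of the usual full-image distortion, so that bits are spent only where the mask is set. Texture inside segments, including faces, then contributes nothing to the loss.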
URL
https://arxiv.org/abs/2403.04173