
ATOM: Attention Mixer for Efficient Dataset Distillation

2024-05-02 15:15:01
Samir Khaki, Ahmad Sajedi, Kai Wang, Lucy Z. Liu, Yuri A. Lawryshyn, Konstantinos N. Plataniotis

Abstract

Recent works in dataset distillation seek to minimize training expenses by generating a condensed synthetic dataset that encapsulates the information present in a larger real dataset. These approaches ultimately aim to attain test accuracy levels akin to those achieved by models trained on the entirety of the original dataset. Previous studies in feature and distribution matching have achieved significant results without incurring the costs of bi-level optimization in the distillation process. Despite their convincing efficiency, many of these methods suffer from marginal downstream performance improvements, limited distillation of contextual information, and subpar cross-architecture generalization. To address these challenges in dataset distillation, we propose the ATtentiOn Mixer (ATOM) module to efficiently distill large datasets using a mixture of channel-wise and spatial-wise attention in the feature matching process. Spatial-wise attention guides the learning process based on the consistent localization of classes in their respective images, allowing distillation from a broader receptive field. Meanwhile, channel-wise attention captures the contextual information associated with the class itself, making the synthetic image more informative for training. By integrating both types of attention, our ATOM module demonstrates superior performance across various computer vision datasets, including CIFAR10/100 and TinyImagenet. Notably, our method significantly improves performance in scenarios with a low number of images per class, enhancing its practical value. Furthermore, the improvements carry over to cross-architecture evaluation and applications such as neural architecture search.

Abstract (translated)

Recent research in dataset distillation aims to minimize training costs by generating a condensed synthetic dataset that contains the information of a larger real dataset. These methods ultimately aim to achieve test accuracy comparable to that of models trained on the entire original dataset. Prior work on feature matching and distribution matching has achieved significant results without incurring the cost of bi-level optimization in the distillation process. Despite their appealing efficiency, many of these methods yield only marginal improvements in downstream performance, extract limited contextual information, and generalize poorly across model architectures. To address these challenges in dataset distillation, we propose the ATtentiOn Mixer (ATOM) module, which uses a mixture of channel-wise and spatial-wise attention to efficiently distill large datasets during feature matching. Spatial-wise attention helps guide the learning process based on the consistent localization of classes in their respective images, enabling distillation from a broader receptive field. Meanwhile, channel-wise attention captures the contextual information associated with the class itself, making the synthetic images more informative for training. By integrating both types of attention, our ATOM module outperforms prior methods on a variety of computer vision datasets, including CIFAR10/100 and TinyImagenet. Notably, our method significantly improves performance when the number of images per class is low, enhancing its potential. Furthermore, we also maintain improvements in applications such as neural architecture search.
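
To make the abstract's attention-mixing objective concrete, below is a minimal sketch in PyTorch of matching spatial-wise and channel-wise attention between real and synthetic feature maps. All names (spatial_attention, channel_attention, atom_matching_loss) and the specific pooling, normalization, and weighting choices are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of attention-mixed feature matching in the spirit of ATOM.
# Assumes PyTorch. The exact attention formulation (pooling power, normalization,
# layer selection, loss weighting) may differ from the paper's implementation.
import torch
import torch.nn.functional as F

def spatial_attention(feat: torch.Tensor) -> torch.Tensor:
    """Pool across channels: (B, C, H, W) -> L2-normalized (B, H*W)."""
    attn = feat.pow(2).mean(dim=1).flatten(1)
    return F.normalize(attn, dim=1)

def channel_attention(feat: torch.Tensor) -> torch.Tensor:
    """Pool across spatial dims: (B, C, H, W) -> L2-normalized (B, C)."""
    attn = feat.pow(2).mean(dim=(2, 3))
    return F.normalize(attn, dim=1)

def atom_matching_loss(real_feats, syn_feats, lam=0.5):
    """Match a mixture of spatial and channel attention between real and
    synthetic features at each selected layer; lam balances the two terms."""
    loss = real_feats[0].new_zeros(())
    for fr, fs in zip(real_feats, syn_feats):
        loss = loss + lam * F.mse_loss(spatial_attention(fr), spatial_attention(fs))
        loss = loss + (1 - lam) * F.mse_loss(channel_attention(fr), channel_attention(fs))
    return loss

# Example usage with random stand-ins for features from two network layers;
# gradients flow back into the synthetic features (and hence synthetic images).
real = [torch.randn(8, 64, 16, 16), torch.randn(8, 128, 8, 8)]
syn = [torch.randn(8, 64, 16, 16, requires_grad=True),
       torch.randn(8, 128, 8, 8, requires_grad=True)]
atom_matching_loss(real, syn).backward()
```

In an actual distillation loop, the synthetic images would be updated by backpropagating this loss through a feature extractor, with per-class batches matched layer by layer.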

URL

https://arxiv.org/abs/2405.01373

PDF

https://arxiv.org/pdf/2405.01373.pdf

