Paper Reading AI Learner

Abutting Grating Illusion: Cognitive Challenge to Neural Network Models

2022-08-08 08:01:11
Jinyu Fan, Yi Zeng

Abstract

Even the state-of-the-art deep learning models lack fundamental abilities compared to humans. Multiple comparison paradigms have been proposed to explore the distinctions between humans and deep learning. While most comparisons rely on corruptions inspired by mathematical transformations, very few have bases on human cognitive phenomena. In this study, we propose a novel corruption method based on the abutting grating illusion, which is a visual phenomenon widely discovered in both human and a wide range of animal species. The corruption method destroys the gradient-defined boundaries and generates the perception of illusory contours using line gratings abutting each other. We applied the method on MNIST, high resolution MNIST, and silhouette object images. Various deep learning models are tested on the corruption, including models trained from scratch and 109 models pretrained with ImageNet or various data augmentation techniques. Our results show that abutting grating corruption is challenging even for state-of-the-art deep learning models because most models are randomly guessing. We also discovered that the DeepAugment technique can greatly improve robustness against abutting grating illusion. Visualisation of early layers indicates that better performing models exhibit stronger end-stopping property, which is consistent with neuroscience discoveries. To validate the corruption method, 24 human subjects are involved to classify samples of corrupted datasets.

Abstract (translated)

URL

https://arxiv.org/abs/2208.03958

PDF

https://arxiv.org/pdf/2208.03958.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot