Paper Reading AI Learner

Toward an Over-parameterized Direct-Fit Model of Visual Perception

2022-10-07 23:54:30
Xin Li

Abstract

In this paper, we revisit the problem of computational modeling of simple and complex cells for an over-parameterized and direct-fit model of visual perception. Unlike conventional wisdom, we highlight the difference in parallel and sequential binding mechanisms between simple and complex cells. A new proposal for abstracting them into space partitioning and composition is developed as the foundation of our new hierarchical construction. Our construction can be interpreted as a product topology-based generalization of the existing k-d tree, making it suitable for brute-force direct-fit in a high-dimensional space. The constructed model has been applied to several classical experiments in neuroscience and psychology. We provide an anti-sparse coding interpretation of the constructed vision model and show how it leads to a dynamic programming (DP)-like approximate nearest-neighbor search based on $\ell_{\infty}$-optimization. We also briefly discuss two possible implementations based on asymmetrical (decoder matters more) auto-encoder and spiking neural networks (SNN), respectively.

Abstract (translated)

URL

https://arxiv.org/abs/2210.03850

PDF

https://arxiv.org/pdf/2210.03850.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot