Paper Reading AI Learner

Fully Quantized Always-on Face Detector Considering Mobile Image Sensors

2023-11-02 05:35:49
Haechang Lee, Wongi Jeong, Dongil Ryu, Hyunwoo Je, Albert No, Kijeong Kim, Se Young Chun

Abstract

Despite significant research on lightweight deep neural networks (DNNs) for edge devices, current face detectors do not fully meet the requirements of "intelligent" CMOS image sensors (iCISs) with embedded DNNs. These sensors are essential in practical applications such as energy-efficient mobile phones and always-on surveillance systems. One notable limitation is the absence of face detectors suited to the always-on scenario, a crucial aspect of image sensor-level applications. Such detectors must operate directly on sensor RAW data, before the image signal processor (ISP) takes over. This gap makes it difficult to achieve good performance in such scenarios, and further research and development are needed to fully leverage the potential of iCIS applications. In this study, we aim to bridge the gap by exploring extremely low-bit lightweight face detectors, focusing on always-on face detection for mobile image sensor applications. Our proposed model takes sensor-aware synthetic RAW inputs, simulating always-on face detection performed "before" the ISP chain. Our approach employs ternary (-1, 0, 1) weights with a view to implementation in image sensors, resulting in a relatively simple network architecture with shallow layers and extremely low bitwidth. In simulation studies, our method demonstrates reasonable face detection performance and excellent efficiency, offering promising possibilities for practical always-on face detectors in real-world applications.
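The abstract states that the weights are ternary (-1, 0, 1) but does not say how they are obtained. As a rough illustration only, the sketch below uses a widely used threshold heuristic (a threshold of about 0.7 times the mean absolute weight, plus one shared scale factor); this is an assumption for illustration, not the authors' method. The function name `ternarize` and the parameter `delta_scale` are hypothetical.

```python
from typing import List, Tuple


def ternarize(weights: List[float], delta_scale: float = 0.7) -> Tuple[List[int], float]:
    """Map real-valued weights to {-1, 0, 1} plus one shared scale factor.

    Assumed heuristic (not taken from the paper): weights whose magnitude
    is at most delta_scale * mean(|w|) become 0; the rest keep their sign.
    """
    if not weights:
        return [], 0.0
    # Threshold: a fraction of the mean absolute weight.
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    delta = delta_scale * mean_abs
    ternary = [0 if abs(w) <= delta else (1 if w > 0 else -1) for w in weights]
    # Scale: mean magnitude of the weights that survived thresholding.
    kept = [abs(w) for w, t in zip(weights, ternary) if t != 0]
    alpha = sum(kept) / len(kept) if kept else 0.0
    return ternary, alpha


weights = [0.9, -0.05, 0.4, -0.7, 0.02]
ternary, alpha = ternarize(weights)
print(ternary)  # [1, 0, 1, -1, 0]
```

With weights restricted to -1, 0, and 1, each multiply in a layer reduces to an add, a subtract, or a skip, followed by a single multiplication by the shared scale. This is one reason ternary networks are attractive for the sensor-level hardware the abstract describes.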

URL

https://arxiv.org/abs/2311.01001

PDF

https://arxiv.org/pdf/2311.01001.pdf

