Template Matching in Images using Segmented Normalized Cross-Correlation

2025-02-03 11:58:33
Davor Marušić, Siniša Popović, Zoran Kalafatić

Abstract

In this paper, a new variant of an algorithm for normalized cross-correlation (NCC) is proposed in the context of template matching in images. The proposed algorithm is based on the precomputation of a template image approximation, enabling more efficient calculation of approximate NCC with the source image than using the original template for exact NCC calculation. The approximate template is precomputed from the template image by a split-and-merge approach, resulting in a decomposition into axis-aligned rectangular segments, whose sizes depend on per-segment pixel intensity variance. In the approximate template, each segment is assigned the mean grayscale value of the corresponding pixels from the original template. The proposed algorithm achieves superior computational performance with negligible NCC approximation errors compared to the well-known Fast Fourier Transform (FFT)-based NCC algorithm, when applied to less visually complex and/or smaller template images. In other cases, the proposed algorithm can maintain either computational performance or NCC approximation error within the range of the FFT-based algorithm, but not both.
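
The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `split_segments`, `approx_ncc`, and `max_var` are hypothetical, the decomposition below only splits (the merge step is omitted for brevity), and normalizing by the energy of the approximate template is an assumption. Because each segment holds a constant value, the NCC numerator reduces to one box sum per segment, which an integral image of the source provides in constant time per position.

```python
import numpy as np

def split_segments(t, max_var, y0=0, x0=0):
    """Recursively halve the template until each rectangle's variance is <= max_var.
    Returns a list of (y0, x0, height, width, mean). Split only; no merge step."""
    h, w = t.shape
    if t.var() <= max_var or (h == 1 and w == 1):
        return [(y0, x0, h, w, float(t.mean()))]
    if h >= w:  # split along the longer side
        m = h // 2
        return (split_segments(t[:m], max_var, y0, x0)
                + split_segments(t[m:], max_var, y0 + m, x0))
    m = w // 2
    return (split_segments(t[:, :m], max_var, y0, x0)
            + split_segments(t[:, m:], max_var, y0, x0 + m))

def integral(a):
    """Integral image with a leading row and column of zeros."""
    ii = np.zeros((a.shape[0] + 1, a.shape[1] + 1))
    ii[1:, 1:] = a.cumsum(0).cumsum(1)
    return ii

def box_sums(ii, y0, x0, h, w, out_h, out_w):
    """Sum of the source over rectangle (y0, x0, h, w) for every template offset."""
    return (ii[y0 + h:y0 + h + out_h, x0 + w:x0 + w + out_w]
            - ii[y0:y0 + out_h, x0 + w:x0 + w + out_w]
            - ii[y0 + h:y0 + h + out_h, x0:x0 + out_w]
            + ii[y0:y0 + out_h, x0:x0 + out_w])

def approx_ncc(image, template, max_var):
    """Approximate NCC map using a piecewise-constant template approximation."""
    image = image.astype(float)
    template = template.astype(float)
    th, tw = template.shape
    oh, ow = image.shape[0] - th + 1, image.shape[1] - tw + 1
    n = th * tw
    segs = split_segments(template, max_var)
    # Mean and energy of the approximate template (assumed normalization).
    t_mean = sum(h * w * mu for _, _, h, w, mu in segs) / n
    t_energy = sum(h * w * (mu - t_mean) ** 2 for _, _, h, w, mu in segs)
    ii, ii2 = integral(image), integral(image ** 2)
    # Numerator: one weighted box sum per segment.
    num = np.zeros((oh, ow))
    for y0, x0, h, w, mu in segs:
        num += (mu - t_mean) * box_sums(ii, y0, x0, h, w, oh, ow)
    # Per-window energy of the source around its local mean.
    win = box_sums(ii, 0, 0, th, tw, oh, ow)
    win2 = box_sums(ii2, 0, 0, th, tw, oh, ow)
    f_energy = win2 - win ** 2 / n
    denom = np.sqrt(np.maximum(f_energy * t_energy, 0.0))
    # Flat windows or a flat approximate template give zero correlation.
    return np.divide(num, denom, out=np.zeros_like(num), where=denom > 1e-12)
```

In this sketch, a larger `max_var` yields fewer segments and less work per position at the cost of a coarser approximation, while `max_var=0` splits down to constant regions and reproduces exact NCC.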

URL

https://arxiv.org/abs/2502.01286

PDF

https://arxiv.org/pdf/2502.01286.pdf

