Paper Reading AI Learner

Fast Exact Retrieval for Nearest-neighbor Lookup

2024-05-07 15:57:39
Richard Zhu

Abstract

Exact nearest neighbor search is a computationally intensive process, and even its simpler sibling -- vector retrieval -- can be computationally complex. This is exacerbated when retrieving vectors whose dimension $d$ is high relative to the number of vectors, $N$, in the database. Exact nearest neighbor retrieval has been generally acknowledged to be an $O(Nd)$ problem with no sub-linear solutions. Attention has instead shifted towards Approximate Nearest-Neighbor (ANN) retrieval techniques, many of which have sub-linear or even logarithmic time complexities. However, if our intuition from binary search problems (e.g. $d=1$ vector retrieval) carries, there ought to be a way to retrieve an organized representation of vectors without brute-forcing our way to a solution. For low dimensions (e.g. the $d=2$ or $d=3$ cases), \texttt{kd-trees} provide an $O(d\log N)$ algorithm for retrieval. Unfortunately, in practice the algorithm deteriorates rapidly to an $O(dN)$ solution at high dimensions (e.g. $d=128$). We propose a novel algorithm for logarithmic Fast Exact Retrieval for Nearest-neighbor lookup (FERN), inspired by \texttt{kd-trees}. The algorithm achieves $O(d\log N)$ look-up with 100\% recall on 10 million $d=128$ uniformly randomly generated vectors.\footnote{Code available at this https URL}
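The \texttt{kd-tree} baseline the abstract builds on can be sketched in a few lines. This is a generic textbook kd-tree for exact nearest-neighbor lookup, not the FERN algorithm itself (the paper's code is linked via the URL above); the comment in the backtracking step marks the pruning test whose failure in high dimensions drives the $O(dN)$ degradation the abstract describes.

```python
import random

def build_kdtree(points, depth=0):
    """Recursively split points on one coordinate axis per tree level."""
    if not points:
        return None
    d = len(points[0])
    axis = depth % d
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2  # median split keeps the tree balanced
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nn_query(node, target, best=None):
    """Exact nearest neighbor: descend to the target's leaf cell, then
    backtrack, exploring the far side of a split only when the splitting
    hyperplane is closer than the current best point."""
    if node is None:
        return best
    if best is None or sq_dist(node["point"], target) < sq_dist(best, target):
        best = node["point"]
    axis = node["axis"]
    diff = target[axis] - node["point"][axis]
    near, far = (
        (node["left"], node["right"]) if diff < 0
        else (node["right"], node["left"])
    )
    best = nn_query(near, target, best)
    # In high dimensions this pruning test almost never fails to fire,
    # so the search ends up visiting nearly all N nodes: O(dN).
    if diff ** 2 < sq_dist(best, target):
        best = nn_query(far, target, best)
    return best

if __name__ == "__main__":
    random.seed(42)
    pts = [tuple(random.random() for _ in range(2)) for _ in range(1000)]
    tree = build_kdtree(pts)
    print(nn_query(tree, (0.5, 0.5)))
```

For $d=2$ and uniformly random points the backtracking step rarely visits the far subtree, giving the $O(d\log N)$ behavior cited above; rerunning the demo with $d=128$ makes the degradation easy to observe.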


URL

https://arxiv.org/abs/2405.04435

PDF

https://arxiv.org/pdf/2405.04435.pdf

