Nearest-Neighbor Search in Real Time

yotamhc
Jun 1, 2015

Similarity search, and specifically the nearest-neighbor (NN) search problem, is widely used in many fields of computer science, such as machine learning, computer vision, and databases. However, in many settings such searches suffer from the notorious curse of dimensionality, where running time grows exponentially with the dimension d. This causes severe performance degradation in high-dimensional spaces. Approximate techniques such as locality-sensitive hashing improve search performance but are still computationally intensive. In this work we propose a new way to solve this problem using a special hardware device called ternary content-addressable memory (TCAM). TCAM is an associative memory, a special type of computer memory that is widely used in switches and routers for very high-speed search applications. We show that the TCAM computational model can be leveraged and adjusted to solve NN search problems in a single TCAM lookup cycle (a few nanoseconds) and with linear space. This concept does not suffer from the curse of dimensionality and is shown to improve on the best known approaches for NN by more than four orders of magnitude. Simulation results demonstrate the improvement and suggest that TCAM devices may play a critical role in future large-scale databases and cloud applications.
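For readers unfamiliar with TCAM, the following is a minimal software sketch of the ternary matching model that the abstract refers to. It is not the encoding scheme from the paper, and the names `parse_pattern` and `SoftwareTcam` are made up for illustration. Each entry is a string of `0`, `1`, and `*` (don't care), and a lookup returns the index of the first matching entry. Real TCAM hardware compares the key against all entries in parallel; the loop here only emulates that behavior.

```python
from typing import List, Optional, Tuple


def parse_pattern(pattern: str) -> Tuple[int, int]:
    """Convert a ternary string such as '10**' into a (value, mask) pair.

    Mask bits are 1 where the pattern cares about the key bit
    and 0 where the pattern has a '*' (don't care).
    """
    value = 0
    mask = 0
    for ch in pattern:
        value <<= 1
        mask <<= 1
        if ch in "01":
            mask |= 1
            value |= int(ch)
        elif ch != "*":
            raise ValueError(f"invalid ternary symbol: {ch!r}")
    return value, mask


class SoftwareTcam:
    """Hypothetical software emulation of a TCAM (illustration only)."""

    def __init__(self, patterns: List[str]) -> None:
        # Entries are kept in priority order: lower index wins.
        self.entries = [parse_pattern(p) for p in patterns]

    def lookup(self, key: int) -> Optional[int]:
        """Return the index of the first entry that matches the key."""
        # Hardware checks every entry at once; here we scan sequentially.
        for index, (value, mask) in enumerate(self.entries):
            if (key & mask) == value:
                return index
        return None


tcam = SoftwareTcam(["1010", "10**", "****"])
print(tcam.lookup(0b1010))  # 0: exact match on the first entry
print(tcam.lookup(0b1001))  # 1: matches '10**' through its wildcards
print(tcam.lookup(0b0111))  # 2: only the catch-all entry matches
```

A single wildcard entry such as `10**` covers a whole block of keys, which is the property that lets one TCAM lookup stand in for many comparisons.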

This paper was presented at ACM DaMoN 2015, Melbourne, VIC, Australia.

A full version of the paper, which also shows applications of the encoding scheme in networking and packet classification, has been accepted to ACM SPAA 2016, to be held in July 2016.

Yotam Harchol
