home · mobile · calendar · defenses · 1999-2000 · 

Thesis Defense - Wang

Nearest Neighbor Search in High Dimensional Euclidean Space
Ning Wang
Computer Science PhD Candidate

The nearest neighbor search in d-dimensional space is one of the fundamental problems in computational geometry, with many important applications. The problem has been studied extensively in the fields of data structure and algorithm design as well as computational geometry in order to create efficient algorithm. Unfortunately the usefulness of all existing algorithms are severely impaired by either their preprocessing cost or search performance degradation with increase of dimensionality. On the theoretical aspect, the best known bound for the problem is polylog search time with enormous nd/2 ⌋ storage requirement. On the other hand, even the most practically efficient k-d tree algorithm suffers from the dimension curse. The constant with exponential growth with dimension makes the algorithm of little use for applications with moderately high dimensionality (16 < d < 20).

The thesis studies the nearest neighbor search problem in a high dimensional setting with a focus on the development of new practical algorithms. We present two algorithms which effectively solve the search problem in high dimensional Euclidean space. The first algorithm is a variant of k-d tree algorithm. With our refinement, this algorithm significantly outperforms the standard k-d tree in high dimensional space with various kinds of data distributions. The second algorithm is based on a new space elimination strategy that uses the projection of the data set to selected high dimensional lines. It explores both the distance and directional relationship between the data set and the query point in the process of pruning the search tree. The algorithm has a linear space complexity and a sublinear query time complexity in average case.

To validate the usefulness of the new algorithms, we applied these algorithm to a 3d appearance recognition program. The experiment results are presented.

Committee: Oliver McBryan, Professor (Chair)
Harold (Hal) Gabow, Professor
Dirk Grunwald, Associate Professor
Xiao-Chuan Cai, Associate Professor
Paul Swarztrauber, Adjoint Professor
Department of Computer Science
University of Colorado Boulder
Boulder, CO 80309-0430 USA
May 5, 2012 (14:20)