TY - JOUR

T1 - Reporting neighbors in high-dimensional Euclidean space

AU - Aiger, Dror

AU - Kaplan, Haim

AU - Sharir, Micha

PY - 2014

Y1 - 2014

N2 - We consider the following problem, which arises in many database and web-based applications: Given a set P of n points in a high-dimensional space R^d and a distance r, we want to report all pairs of points of P at Euclidean distance at most r. We present two randomized algorithms, one based on randomly shifted grids, and the other on randomly shifted and rotated grids. The running time of both algorithms is of the form C(d)(n + k) log n, where k is the output size and C(d) is a constant that depends on the dimension d. The log n factor is needed to guarantee, with high probability, that all neighbor pairs are reported, and can be dropped if it suffices to report, in expectation, an arbitrarily large fraction of the pairs. When only translations are used, C(d) is of the form (a√d)^d, for some (small) absolute constant a ≈ 0.484; this bound is worst-case tight, up to an exponential factor of about 2^d. When both rotations and translations are used, C(d) can be improved to roughly 6.74^d, getting rid of the superexponential factor (√d)^d. When the input set (lies in a subset of d-space that) has low doubling dimension δ, the performance of the first algorithm improves to C(d, δ)(n + k) log n (or to C(d, δ)(n + k)), where C(d, δ) = O((ed/δ)^δ) for δ ≤ √d. Otherwise, C(d, δ) = O(e^(√d) (√d)^δ). We also present experimental results on several large data sets, demonstrating that our algorithms run significantly faster than all the leading existing algorithms for reporting neighbors. © 2014 Society for Industrial and Applied Mathematics. Key words: computational geometry, nearest neighbors, near-neighbor searching, high-dimensional spaces, locality sensitive hashing, random grids. [...] author has also been supported by grant 822/10 from the Israel Science Fund and by grant 2006/204 from the U.S.-Israel Binational Science Foundation.
Work by Micha Sharir has also been supported by grant 338/09 from the Israel Science Fund, and by the Hermann Minkowski-MINERVA Center for Geometry at Tel Aviv University. The second and third authors have been supported by the Israeli Centers of Research Excellence (I-CORE) program (Center 4/11).

AB - We consider the following problem, which arises in many database and web-based applications: Given a set P of n points in a high-dimensional space R^d and a distance r, we want to report all pairs of points of P at Euclidean distance at most r. We present two randomized algorithms, one based on randomly shifted grids, and the other on randomly shifted and rotated grids. The running time of both algorithms is of the form C(d)(n + k) log n, where k is the output size and C(d) is a constant that depends on the dimension d. The log n factor is needed to guarantee, with high probability, that all neighbor pairs are reported, and can be dropped if it suffices to report, in expectation, an arbitrarily large fraction of the pairs. When only translations are used, C(d) is of the form (a√d)^d, for some (small) absolute constant a ≈ 0.484; this bound is worst-case tight, up to an exponential factor of about 2^d. When both rotations and translations are used, C(d) can be improved to roughly 6.74^d, getting rid of the superexponential factor (√d)^d. When the input set (lies in a subset of d-space that) has low doubling dimension δ, the performance of the first algorithm improves to C(d, δ)(n + k) log n (or to C(d, δ)(n + k)), where C(d, δ) = O((ed/δ)^δ) for δ ≤ √d. Otherwise, C(d, δ) = O(e^(√d) (√d)^δ). We also present experimental results on several large data sets, demonstrating that our algorithms run significantly faster than all the leading existing algorithms for reporting neighbors. © 2014 Society for Industrial and Applied Mathematics. Key words: computational geometry, nearest neighbors, near-neighbor searching, high-dimensional spaces, locality sensitive hashing, random grids. [...] author has also been supported by grant 822/10 from the Israel Science Fund and by grant 2006/204 from the U.S.-Israel Binational Science Foundation.
Work by Micha Sharir has also been supported by grant 338/09 from the Israel Science Fund, and by the Hermann Minkowski-MINERVA Center for Geometry at Tel Aviv University. The second and third authors have been supported by the Israeli Centers of Research Excellence (I-CORE) program (Center 4/11).

UR - http://www.scopus.com/inward/record.url?scp=84906810443&partnerID=8YFLogxK

U2 - 10.1137/12089867X

DO - 10.1137/12089867X

M3 - Article

AN - SCOPUS:84906810443

SN - 0097-5397

VL - 43

SP - 1363

EP - 1395

JO - SIAM Journal on Computing

JF - SIAM Journal on Computing

IS - 4

ER -