Space and time efficient kernel density estimation in high dimensions

Arturs Backurs, Piotr Indyk, Tal Wagner

Research output: Contribution to journalConference articlepeer-review

47 Scopus citations

Abstract

Recently, Charikar and Siminelakis (2017) presented a framework for kernel density estimation in provably sublinear query time, for kernels that possess a certain hashing-based property. However, their data structure requires a significantly increased super-linear storage space, as well as super-linear preprocessing time. These limitations inhibit the practical applicability of their approach on large datasets. In this work, we present an improvement to their framework that retains the same query time, while requiring only linear space and linear preprocessing time. We instantiate our framework with the Laplacian and Exponential kernels, two popular kernels which possess the aforementioned property. Our experiments on various datasets verify that our approach attains accuracy and query time similar to Charikar and Siminelakis (2017), with significantly improved space and preprocessing time.

Original languageEnglish
JournalAdvances in Neural Information Processing Systems
Volume32
StatePublished - 2019
Externally publishedYes
Event33rd Annual Conference on Neural Information Processing Systems, NeurIPS 2019 - Vancouver, Canada
Duration: 8 Dec 201914 Dec 2019

Fingerprint

Dive into the research topics of 'Space and time efficient kernel density estimation in high dimensions'. Together they form a unique fingerprint.

Cite this