TY - JOUR
T1 - Same-cluster querying for overlapping clusters
AU - Huleihel, Wasim
AU - Mazumdar, Arya
AU - Médard, Muriel
AU - Pal, Soumyabrata
N1 - Publisher Copyright: © 2019 Neural information processing systems foundation. All rights reserved.
PY - 2019
Y1 - 2019
AB - Overlapping clusters are common in models of many practical data-segmentation applications. Suppose we are given n elements to be clustered into k possibly overlapping clusters, and an oracle that can interactively answer queries of the form “do elements u and v belong to the same cluster?” The goal is to recover the clusters with the minimum number of such queries. This problem has been of recent interest for the case of disjoint clusters. In this paper, we look at the more practical scenario of overlapping clusters, and provide upper bounds (with algorithms) on the sufficient number of queries. We provide algorithmic results under both arbitrary (worst-case) and statistical modeling assumptions. Our algorithms are parameter-free, efficient, and work in the presence of random noise. We also derive information-theoretic lower bounds on the number of queries needed, proving that our algorithms are order-optimal. Finally, we test our algorithms over both synthetic and real-world data, showing their practicality and effectiveness.
UR - http://www.scopus.com/inward/record.url?scp=85090171880&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85090171880
SN - 1049-5258
VL - 32
JO - Advances in Neural Information Processing Systems
JF - Advances in Neural Information Processing Systems
T2 - 33rd Annual Conference on Neural Information Processing Systems, NeurIPS 2019
Y2 - 8 December 2019 through 14 December 2019
ER -