Representative Selection in Nonmetric Datasets

Elad Liebman*, Benny Chor, Peter Stone

*Corresponding author for this work

Research output: Contribution to journal (article, peer-reviewed)

4 Scopus citations

Abstract

This study considers the problem of representative selection: choosing a subset of data points from a dataset that best represents its overall set of elements. This subset needs to inherently reflect the type of information contained in the entire set, while minimizing redundancy. For such purposes, clustering might seem like a natural approach. However, existing clustering methods are not ideally suited for representative selection, especially when dealing with nonmetric data, in which only a pairwise similarity measure exists. In this article we propose δ-medoids, a novel approach that can be viewed as an extension of the k-medoids algorithm and is specifically suited for sample representative selection from nonmetric data. We empirically validate δ-medoids in two domains: music analysis and motion analysis. We also show some theoretical bounds on the performance of δ-medoids and the hardness of representative selection in general.
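To make the problem statement concrete, the following is a minimal sketch of representative selection when only a pairwise dissimilarity is available. It is not the δ-medoids algorithm from the article, whose details are not given in this abstract. It is a simple greedy cover, written under the assumption that a point counts as "represented" when its dissimilarity to some chosen representative is at most a threshold `delta`. The names `greedy_delta_cover`, `dist`, and `delta` are hypothetical and chosen for illustration only.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def greedy_delta_cover(
    items: Sequence[T],
    dist: Callable[[T, T], float],
    delta: float,
) -> List[int]:
    """Return indices of representatives chosen by a single greedy pass.

    Illustrative only: NOT the article's delta-medoids algorithm.
    Assumption: an item is covered if dist(representative, item) <= delta.
    Only pairwise dissimilarities are used, so dist need not be a metric.
    """
    representatives: List[int] = []
    for i, item in enumerate(items):
        # Promote the item to a representative only if no existing
        # representative already covers it.
        if all(dist(items[r], item) > delta for r in representatives):
            representatives.append(i)
    return representatives


# Squared difference is a dissimilarity that violates the triangle
# inequality, so it serves as a small nonmetric example.
points = [0.0, 0.1, 0.2, 5.0, 5.05, 9.0]
chosen = greedy_delta_cover(points, lambda a, b: (a - b) ** 2, delta=0.25)
print([points[i] for i in chosen])
```

In this sketch a smaller `delta` yields more representatives and less information loss, while a larger `delta` yields fewer representatives and less redundancy. The result also depends on the order in which items are visited.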

Original language: English
Pages (from-to): 807-838
Number of pages: 32
Journal: Applied Artificial Intelligence
Volume: 29
Issue number: 8
State: Published - 14 Sep 2015

Funding

Funder: Funder number
Office of Naval Research: 21C184-01
Air Force Office of Scientific Research: FA9550-14-1-0087
Air Force Research Laboratory: FA8750-14-1-0070
National Science Foundation: 1330072, 1305287, CNS-1305287, CNS-1330072
