Abstract
Traditionally, clustering problems are investigated under the assumption that all objects must be clustered. A shortcoming of this formulation is that a few distant objects, called outliers, may exert a disproportionately strong influence over the solution. In this work we investigate the k-MIN-SUM clustering problem while addressing outliers in a meaningful way. Given a complete graph $G = (V, E)$, a weight function $w: E \to \mathbb{N}_0$ on its edges, and a penalty function $p: V \to \mathbb{N}_0$ on its nodes, the PENALIZED k-MIN-SUM PROBLEM is the problem of finding a partition of $V$ into $k + 1$ sets, $\{S_1, \ldots, S_{k+1}\}$, minimizing $\sum_{i=1}^{k} w(S_i) + p(S_{k+1})$, where for $S \subseteq V$, $w(S) = \sum_{e = \{i, j\} \subset S} w_e$ and $p(S) = \sum_{i \in S} p_i$. We offer an efficient 2-approximation to the penalized 1-min-sum problem using a primal-dual algorithm. We prove that the penalized 1-min-sum problem is NP-hard even if $w$ is a metric and present a randomized approximation scheme for it. For the metric penalized k-min-sum problem we offer a 2-approximation.
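To make the objective concrete, the following is a minimal Python sketch that evaluates the cost of a given partition: the total edge weight inside each of the k clusters plus the total penalty of the nodes left out as outliers. It also includes an exhaustive search for the k = 1 case on tiny inputs. The function names (`w_of`, `p_of`, `cost`, `brute_force_k1`) and the four-node instance are illustrative assumptions, not taken from the paper, and the exponential-time search is only a reference baseline, not the primal-dual algorithm mentioned in the abstract.

```python
from itertools import combinations


def w_of(S, weights):
    """w(S): total weight of all edges with both endpoints in S."""
    return sum(weights[frozenset(e)] for e in combinations(S, 2))


def p_of(S, penalties):
    """p(S): total penalty of the nodes in S."""
    return sum(penalties[v] for v in S)


def cost(clusters, outliers, weights, penalties):
    """Objective: sum of w(S_i) over the k clusters, plus p(S_{k+1})."""
    return sum(w_of(S, weights) for S in clusters) + p_of(outliers, penalties)


def brute_force_k1(nodes, weights, penalties):
    """Exhaustive optimum for k = 1 (exponential time; tiny inputs only)."""
    nodes = list(nodes)
    best = None
    for r in range(len(nodes) + 1):
        for S in combinations(nodes, r):
            outliers = [v for v in nodes if v not in S]
            c = cost([S], outliers, weights, penalties)
            if best is None or c < best[0]:
                best = (c, set(S))
    return best


# Hypothetical instance: a, b, c are mutually close; d is far from all of them.
nodes = ["a", "b", "c", "d"]
weights = {frozenset(e): 1 for e in combinations("abc", 2)}
weights.update({frozenset((v, "d")): 10 for v in "abc"})
penalties = {v: 3 for v in nodes}

all_clustered = cost([nodes], [], weights, penalties)
d_as_outlier = cost([["a", "b", "c"]], ["d"], weights, penalties)
best_cost, best_cluster = brute_force_k1(nodes, weights, penalties)
```

On this instance, forcing every node into the single cluster pays for the three heavy edges incident to `d`, whereas declaring `d` an outlier replaces them with one penalty. This is the effect the penalized formulation is meant to capture.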
Original language | English |
---|---|
Pages (from-to) | 167-178 |
Number of pages | 12 |
Journal | Lecture Notes in Computer Science |
Volume | 3669 |
State | Published - 2005 |
Event | 13th Annual European Symposium on Algorithms, ESA 2005 - Palma de Mallorca, Spain; Duration: 3 Oct 2005 → 6 Oct 2005 |