## Abstract

Traditionally, clustering problems are investigated under the assumption that all objects must be clustered. A shortcoming of this formulation is that a few distant objects, called outliers, may exert a disproportionately strong influence over the solution. In this work we investigate the k-MIN-SUM clustering problem while addressing outliers in a meaningful way. Given a complete graph $G = (V, E)$, a weight function $w : E \to \mathbb{N}_0$ on its edges, and a penalty function $p : V \to \mathbb{N}_0$ on its nodes, the PENALIZED k-MIN-SUM PROBLEM is the problem of finding a partition of $V$ into $k + 1$ sets, $\{S_1, \ldots, S_{k+1}\}$, minimizing $\sum_{i=1}^{k} w(S_i) + p(S_{k+1})$, where for $S \subseteq V$, $w(S) = \sum_{e = \{i, j\} \subseteq S} w_e$ and $p(S) = \sum_{i \in S} p_i$. We offer an efficient 2-approximation to the penalized 1-min-sum problem using a primal-dual algorithm. We prove that the penalized 1-min-sum problem is NP-hard even if $w$ is a metric and present a randomized approximation scheme for it. For the metric penalized k-min-sum problem we offer a 2-approximation.
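To make the objective concrete, the following minimal Python sketch evaluates the penalized cost of a given partition and finds an optimum on a tiny instance by exhaustive search. It is an illustration only, not the primal-dual algorithm or the approximation scheme from the paper. The function names, the four-node instance, and the choice to allow empty sets in the partition are all assumptions made for this example.

```python
from itertools import combinations, product


def cluster_weight(cluster, w):
    """w(S): sum of edge weights over all unordered pairs inside one cluster."""
    return sum(w[frozenset(e)] for e in combinations(sorted(cluster), 2))


def penalized_cost(clusters, outliers, w, p):
    """Objective: sum of w(S_i) over the k clusters plus p(S_{k+1})."""
    clustering = sum(cluster_weight(s, w) for s in clusters)
    penalty = sum(p[v] for v in outliers)
    return clustering + penalty


def brute_force(nodes, k, w, p):
    """Exhaustive search over all (k+1)^n labelings; label k marks an outlier.

    Exponential time, so only usable on toy instances. Empty sets are
    allowed here, which is an assumption of this sketch.
    """
    best_cost, best_partition = None, None
    for labels in product(range(k + 1), repeat=len(nodes)):
        clusters = [[v for v, l in zip(nodes, labels) if l == i] for i in range(k)]
        outliers = [v for v, l in zip(nodes, labels) if l == k]
        cost = penalized_cost(clusters, outliers, w, p)
        if best_cost is None or cost < best_cost:
            best_cost, best_partition = cost, (clusters, outliers)
    return best_cost, best_partition


# Hypothetical instance: a, b, c are close together, d is far from all of them.
nodes = ["a", "b", "c", "d"]
w = {
    frozenset(("a", "b")): 1,
    frozenset(("a", "c")): 2,
    frozenset(("b", "c")): 2,
    frozenset(("a", "d")): 10,
    frozenset(("b", "d")): 10,
    frozenset(("c", "d")): 10,
}
p = {"a": 5, "b": 5, "c": 5, "d": 4}

cost, (clusters, outliers) = brute_force(nodes, 1, w, p)
print(cost, clusters, outliers)  # 9 [['a', 'b', 'c']] ['d']
```

On this instance, clustering all four nodes together costs 35, while paying the penalty of 4 for the distant node `d` and clustering the rest costs 5 + 4 = 9. This is the effect the abstract describes: a single outlier no longer dominates the solution.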

| Original language | English |
|---|---|
| Pages (from-to) | 167-178 |
| Number of pages | 12 |
| Journal | Lecture Notes in Computer Science |
| Volume | 3669 |
| State | Published - 2005 |
| Event | 13th Annual European Symposium on Algorithms, ESA 2005 - Palma de Mallorca, Spain; Duration: 3 Oct 2005 → 6 Oct 2005 |