Abstract
We propose a new formulation of the conceptual clustering problem where the goal is to explicitly output a collection of simple and meaningful conjunctions of attributes that define the clusters. The formulation differs from previous approaches since the clusters discovered may overlap and also may not cover all the points. In addition, a point may be assigned to a cluster description even if it only satisfies most, and not necessarily all, of the attributes in the conjunction. Connections between this conceptual clustering problem and the maximum edge biclique problem are made. Simple, randomized algorithms are given that discover a collection of approximate conjunctive cluster descriptions in sublinear time.
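As an illustration of the relaxed membership idea in the abstract, the following minimal sketch assigns a point to a conjunctive cluster description when it satisfies at least a (1 - eps) fraction of the conjunction's attributes. The threshold rule, the parameter `eps`, the function names, and the toy data are all assumptions made for illustration; they are not taken from the paper and may differ from its formal definition.

```python
from typing import List, Set


def satisfies_approximately(point: Set[str], conjunction: Set[str], eps: float) -> bool:
    """Return True if the point has at least a (1 - eps) fraction of the
    attributes in the conjunction (hypothetical membership rule)."""
    if not conjunction:
        return True
    matched = len(point & conjunction)
    return matched >= (1 - eps) * len(conjunction)


def cluster_members(points: List[Set[str]], conjunction: Set[str], eps: float) -> List[int]:
    """Indices of the points assigned to the cluster described by the conjunction."""
    return [i for i, p in enumerate(points)
            if satisfies_approximately(p, conjunction, eps)]


# Toy data (hypothetical): each point is the set of attributes it satisfies.
points = [
    {"red", "round", "small", "sweet"},
    {"red", "round", "small"},
    {"red", "large"},
    {"green", "round", "small", "sweet"},
]
conjunction = {"red", "round", "small", "sweet"}

exact = cluster_members(points, conjunction, eps=0.0)    # all attributes required
approx = cluster_members(points, conjunction, eps=0.25)  # one of four may be missing
print(exact, approx)
```

With `eps=0.0` only the first point belongs to the cluster, while `eps=0.25` also admits the two points that miss a single attribute. The third point belongs to neither, which mirrors the abstract's remark that clusters need not cover every point.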
Original language | English |
---|---|
Pages (from-to) | 115-151 |
Number of pages | 37 |
Journal | Machine Learning |
Volume | 56 |
Issue number | 1-3 |
State | Published - Jul 2004 |
Keywords
- Conceptual clustering
- Maximum edge biclustering