TY - JOUR
T1 - Investigating diversity of clustering methods
T2 - An empirical comparison
AU - Gelbard, Roy
AU - Goldman, Orit
AU - Spiegler, Israel
PY - 2007/10
Y1 - 2007/10
N2 - This paper aims to shed light on why clustering algorithms, despite being quantitative and hence supposedly objective in nature, yield different and varied results. To that end, we tested 10 common clustering algorithms on four well-known datasets that are used in the literature as baselines with agreed-upon clusters. One additional method, Binary-Positive, developed by our team, was added to the analysis. The results affirm the unpredictable nature of the clustering process and point to the different assumptions made by different methods. One conclusion of the study is that the clustering method must be chosen carefully for any given application.
AB - This paper aims to shed light on why clustering algorithms, despite being quantitative and hence supposedly objective in nature, yield different and varied results. To that end, we tested 10 common clustering algorithms on four well-known datasets that are used in the literature as baselines with agreed-upon clusters. One additional method, Binary-Positive, developed by our team, was added to the analysis. The results affirm the unpredictable nature of the clustering process and point to the different assumptions made by different methods. One conclusion of the study is that the clustering method must be chosen carefully for any given application.
KW - Binary-Positive data representation
KW - Cluster analysis
KW - Similarity
UR - http://www.scopus.com/inward/record.url?scp=34250173088&partnerID=8YFLogxK
U2 - 10.1016/j.datak.2007.01.002
DO - 10.1016/j.datak.2007.01.002
M3 - Article
AN - SCOPUS:34250173088
SN - 0169-023X
VL - 63
SP - 155
EP - 166
JO - Data and Knowledge Engineering
JF - Data and Knowledge Engineering
IS - 1
ER -