TY - GEN
T1 - Exclusive row biclustering for gene expression using a combinatorial auction approach
AU - Painsky, Amichai
AU - Rosset, Saharon
PY - 2012
Y1 - 2012
N2 - The availability of large microarray data has led to a growing interest in biclustering methods in the past decade. Several algorithms have been proposed to identify subsets of genes and conditions according to different similarity measures and under varying constraints. In this paper we focus on the exclusive row biclusteing problem (also known as projected clustering) for gene expression data sets, in which each row can only be a member of a single bicluster while columns can participate in multiple clusters. This type of biclustering may be adequate, for example, for clustering groups of cancer patients where each patient (row) is expected to be carrying only a single type of cancer, while each cancer type is associated with multiple (and possibly overlapping) genes (columns). In this paper we present a novel method to identify these exclusive row biclusters through a combination of existing biclustering algorithms and combinatorial auction techniques. We devise an approach for tuning the threshold for our algorithm based on comparison to a null model in the spirit of the Gap statistic approach [11]. We demonstrate our approach on both synthetic and real-world gene expression data and show its power in identifying large span nonoverlapping rows sub matrices, while considering their unique nature. The Gap statistic approach succeeds in identifying appropriate thresholds in all our examples.
AB - The availability of large microarray data has led to a growing interest in biclustering methods in the past decade. Several algorithms have been proposed to identify subsets of genes and conditions according to different similarity measures and under varying constraints. In this paper we focus on the exclusive row biclusteing problem (also known as projected clustering) for gene expression data sets, in which each row can only be a member of a single bicluster while columns can participate in multiple clusters. This type of biclustering may be adequate, for example, for clustering groups of cancer patients where each patient (row) is expected to be carrying only a single type of cancer, while each cancer type is associated with multiple (and possibly overlapping) genes (columns). In this paper we present a novel method to identify these exclusive row biclusters through a combination of existing biclustering algorithms and combinatorial auction techniques. We devise an approach for tuning the threshold for our algorithm based on comparison to a null model in the spirit of the Gap statistic approach [11]. We demonstrate our approach on both synthetic and real-world gene expression data and show its power in identifying large span nonoverlapping rows sub matrices, while considering their unique nature. The Gap statistic approach succeeds in identifying appropriate thresholds in all our examples.
KW - Biclustering
KW - Exclusive row biclustering
KW - Gene expression
KW - Projected clustering
UR - http://www.scopus.com/inward/record.url?scp=84874039514&partnerID=8YFLogxK
U2 - 10.1109/ICDM.2012.25
DO - 10.1109/ICDM.2012.25
M3 - ???researchoutput.researchoutputtypes.contributiontobookanthology.conference???
AN - SCOPUS:84874039514
SN - 9780769549057
T3 - Proceedings - IEEE International Conference on Data Mining, ICDM
SP - 1056
EP - 1061
BT - Proceedings - 12th IEEE International Conference on Data Mining, ICDM 2012
T2 - 12th IEEE International Conference on Data Mining, ICDM 2012
Y2 - 10 December 2012 through 13 December 2012
ER -