Abstract
We have developed a novel algorithm for cluster analysis based on graph-theoretic techniques. A similarity graph is defined, and clusters in that graph correspond to highly connected subgraphs. A polynomial-time algorithm to compute them efficiently is presented. Our algorithm produces a clustering with some provably good properties. The application that motivated this study was gene expression analysis, where a collection of cDNAs must be clustered based on their oligonucleotide fingerprints. The algorithm has been tested intensively on simulated libraries and was shown to outperform extant methods. It demonstrated robustness to high noise levels. In a blind test on real cDNA fingerprint data, the algorithm obtained very good results. Utilizing the results of the algorithm would have saved over 70% of the cDNA sequencing cost on that data set.
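The idea of treating clusters as highly connected subgraphs of a similarity graph can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "highly connected" means the minimum edge cut of a subgraph is larger than half its vertex count, a threshold the abstract does not state, and it takes the similarity graph as a ready-made edge list rather than deriving it from fingerprints. The names `min_cut` and `highly_connected_clusters` are hypothetical, and the minimum cut is found with the standard Stoer-Wagner procedure.

```python
def min_cut(nodes, edges):
    """Stoer-Wagner global minimum cut of an unweighted graph.

    Returns (cut_size, vertices_on_one_side_of_the_cut).
    """
    w = {u: {} for u in nodes}  # w[u][v] = number of edges between super-nodes
    for u, v in edges:
        w[u][v] = w[u].get(v, 0) + 1
        w[v][u] = w[v].get(u, 0) + 1
    members = {u: {u} for u in nodes}  # original vertices inside each super-node
    active = list(nodes)
    best = (float("inf"), set())
    while len(active) > 1:
        # One phase: repeatedly add the vertex most tightly connected
        # to the set of vertices added so far.
        start = active[0]
        added = [start]
        weights = {u: w[start].get(u, 0) for u in active if u != start}
        cut_of_phase = 0
        while weights:
            nxt = max(weights, key=weights.get)
            cut_of_phase = weights.pop(nxt)
            added.append(nxt)
            for u, wt in w[nxt].items():
                if u in weights:
                    weights[u] += wt
        s, t = added[-2], added[-1]
        if cut_of_phase < best[0]:
            best = (cut_of_phase, set(members[t]))
        # Merge the last vertex t into s before the next phase.
        members[s] |= members[t]
        for u, wt in w[t].items():
            if u != s:
                w[s][u] = w[s].get(u, 0) + wt
                w[u][s] = w[u].get(s, 0) + wt
            del w[u][t]
        del w[t]
        active.remove(t)
    return best


def highly_connected_clusters(nodes, edges):
    """Recursively split along minimum cuts until each part is highly connected.

    Assumption: a part with n vertices is kept as a cluster when its
    minimum cut is larger than n / 2.
    """
    nodes = list(nodes)
    if len(nodes) <= 1:
        return [set(nodes)]
    cut, side = min_cut(nodes, edges)
    if cut > len(nodes) / 2:
        return [set(nodes)]
    clusters = []
    for part in (side, set(nodes) - side):
        sub_nodes = [u for u in nodes if u in part]
        sub_edges = [(u, v) for u, v in edges if u in part and v in part]
        clusters += highly_connected_clusters(sub_nodes, sub_edges)
    return clusters


# Toy similarity graph: two 4-cliques joined by a single "noisy" edge.
left, right = ["a", "b", "c", "d"], ["e", "f", "g", "h"]
edges = [(u, v) for grp in (left, right)
         for i, u in enumerate(grp) for v in grp[i + 1:]]
edges.append(("d", "e"))
clusters = highly_connected_clusters(left + right, edges)
```

On this toy graph the single edge between the two cliques is the unique minimum cut, so the sketch splits there and then keeps each clique as one cluster.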
Original language | English |
---|---|
Pages | 188-197 |
Number of pages | 10 |
State | Published - 1999 |
Event | Proceedings of the 1999 3rd Annual International Conference on Computational Molecular Biology, RECOMB '99, Lyon (11 Apr 1999 → 14 Apr 1999) |