TY - JOUR
T1 - Revealing modularity and organization in the yeast molecular network by integrated analysis of highly heterogeneous genomewide data
AU - Tanay, Amos
AU - Sharan, Roded
AU - Kupiec, Martin
AU - Shamir, Ron
PY - 2004/3/2
Y1 - 2004/3/2
AB - The dissection of complex biological systems is a challenging task, made difficult by the size of the underlying molecular network and the heterogeneous nature of the control mechanisms involved. Novel high-throughput techniques are generating massive data sets on various aspects of such systems. Here, we perform analysis of a highly diverse collection of genomewide data sets, including gene expression, protein interactions, growth phenotype data, and transcription factor binding, to reveal the modular organization of the yeast system. By integrating experimental data of heterogeneous sources and types, we are able to perform analysis on a much broader scope than previous studies. At the core of our methodology is the ability to identify modules, namely, groups of genes with statistically significant correlated behavior across diverse data sources. Numerous biological processes are revealed through these modules, which also obey global hierarchical organization. We use the identified modules to study the yeast transcriptional network and predict the function of >800 uncharacterized genes. Our analysis framework, SAMBA (Statistical-Algorithmic Method for Bicluster Analysis), enables the processing of current and future sources of biological information and is readily extendable to experimental techniques and higher organisms.
UR - http://www.scopus.com/inward/record.url?scp=1542357674&partnerID=8YFLogxK
U2 - 10.1073/pnas.0308661100
DO - 10.1073/pnas.0308661100
M3 - Article
AN - SCOPUS:1542357674
SN - 0027-8424
VL - 101
SP - 2981
EP - 2986
JO - Proceedings of the National Academy of Sciences of the United States of America
JF - Proceedings of the National Academy of Sciences of the United States of America
IS - 9
ER -