TY - JOUR

T1 - Generative probabilistic models for protein-protein interaction networks-the biclique perspective

AU - Schweiger, Regev

AU - Linial, Michal

AU - Linial, Nathan

N1 - Funding Information:
Funding: Prospects (EU, Framework VII, partially); Israel Science Foundation (ISF 592/07); Fellowship from the SCCB, the Sudarsky Center for Computational Biology (to R.S.).

PY - 2011/7

Y1 - 2011/7

N2 - Motivation: Much of the large-scale molecular data from living cells can be represented in terms of networks. Such networks occupy a central position in cellular systems biology. In the protein-protein interaction (PPI) network, nodes represent proteins and edges represent connections between them, based on experimental evidence. As PPI networks are rich and complex, a mathematical model is sought to capture their properties and shed light on PPI evolution. The mathematical literature contains various generative models of random graphs. It is a major, still largely open question, which of these models (if any) can properly reproduce various biologically interesting networks. Here, we consider this problem where the graph at hand is the PPI network of Saccharomyces cerevisiae. We are trying to distinguishing between a model family which performs a process of copying neighbors, represented by the duplication-divergence (DD) model, and models which do not copy neighbors, with the Barabási-Albert (BA) preferential attachment model as a leading example. Results: The observed property of the network is the distribution of maximal bicliques in the graph. This is a novel criterion to distinguish between models in this area. It is particularly appropriate for this purpose, since it reflects the graph's growth pattern under either model. This test clearly favors the DD model. In particular, for the BA model, the vast majority (92.9%) of the bicliques with both sides ≥4 must be already embedded in the model's seed graph, whereas the corresponding figure for the DD model is only 5.1%. Our results, based on the biclique perspective, conclusively show that a naïve unmodified DD model can capture a key aspect of PPI networks.

AB - Motivation: Much of the large-scale molecular data from living cells can be represented in terms of networks. Such networks occupy a central position in cellular systems biology. In the protein-protein interaction (PPI) network, nodes represent proteins and edges represent connections between them, based on experimental evidence. As PPI networks are rich and complex, a mathematical model is sought to capture their properties and shed light on PPI evolution. The mathematical literature contains various generative models of random graphs. It is a major, still largely open question, which of these models (if any) can properly reproduce various biologically interesting networks. Here, we consider this problem where the graph at hand is the PPI network of Saccharomyces cerevisiae. We are trying to distinguishing between a model family which performs a process of copying neighbors, represented by the duplication-divergence (DD) model, and models which do not copy neighbors, with the Barabási-Albert (BA) preferential attachment model as a leading example. Results: The observed property of the network is the distribution of maximal bicliques in the graph. This is a novel criterion to distinguish between models in this area. It is particularly appropriate for this purpose, since it reflects the graph's growth pattern under either model. This test clearly favors the DD model. In particular, for the BA model, the vast majority (92.9%) of the bicliques with both sides ≥4 must be already embedded in the model's seed graph, whereas the corresponding figure for the DD model is only 5.1%. Our results, based on the biclique perspective, conclusively show that a naïve unmodified DD model can capture a key aspect of PPI networks.

UR - http://www.scopus.com/inward/record.url?scp=79959415714&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btr201

DO - 10.1093/bioinformatics/btr201

M3 - ???researchoutput.researchoutputtypes.contributiontojournal.article???

C2 - 21685063

AN - SCOPUS:79959415714

SN - 1367-4803

VL - 27

SP - i142-i148

JO - Bioinformatics

JF - Bioinformatics

IS - 13

M1 - btr201

ER -