TY - JOUR
T1 - A likelihood framework to analyse phyletic patterns
AU - Cohen, Ofir
AU - Rubinstein, Nimrod D.
AU - Stern, Adi
AU - Gophna, Uri
AU - Pupko, Tal
PY - 2008/12/27
Y1 - 2008/12/27
AB - Probabilistic evolutionary models revolutionized our capability to extract biological insights from sequence data. While these models accurately describe the stochastic processes of site-specific substitutions, single-base substitutions represent only a fraction of all the events that shape genomes. Specifically, in microbes, events in which entire genes are gained (e.g. via horizontal gene transfer) and lost play a pivotal evolutionary role. In this research, we present a novel likelihood-based evolutionary model for gene gains and losses, and use it to analyse genome-wide patterns of the presence and absence of gene families. The model assumes a Markovian stochastic process, where gains and losses are represented by the transition between presence and absence, respectively, given an underlying phylogenetic tree. To account for differences in the rates of gain and loss of different gene families, we assume among-gene family rate variability, thus allowing for more accurate description of the data. Using the Bayesian approach, we estimated an evolutionary rate for each gene family. Simulation studies demonstrated that our methodology accurately infers these rates. Our methodology was applied to analyse a large corpus of data, consisting of 4873 gene families spanning 63 species and revealed novel insights regarding the evolutionary nature of genome-wide gain and loss dynamics.
KW - Gene content
KW - Gene gain and loss
KW - Genome evolution
KW - Horizontal gene transfer
KW - Phyletic pattern
KW - Probabilistic evolutionary models
UR - http://www.scopus.com/inward/record.url?scp=57149086448&partnerID=8YFLogxK
U2 - 10.1098/rstb.2008.0177
DO - 10.1098/rstb.2008.0177
M3 - Article
AN - SCOPUS:57149086448
SN - 0962-8436
VL - 363
SP - 3903
EP - 3911
JO - Philosophical Transactions of the Royal Society B: Biological Sciences
JF - Philosophical Transactions of the Royal Society B: Biological Sciences
IS - 1512
ER -