TY - GEN
T1 - A chemical-distance-based test for positive Darwinian selection
AU - Pupko, Tal
AU - Sharan, Roded
AU - Hasegawa, Masami
AU - Shamir, Ron
AU - Graur, Dan
N1 - Publisher Copyright: © Springer-Verlag Berlin Heidelberg 2001.
PY - 2001
Y1 - 2001
AB - There are very few instances in which positive Darwinian selection has been convincingly demonstrated at the molecular level. In this study, we present a novel test for detecting positive selection at the amino-acid level. In this test, amino-acid replacements are characterized in terms of chemical distances, i.e., degrees of dissimilarity between the exchanged residues in a protein. The test identifies statistically significant deviations of the mean observed chemical distance from its expectation, either along a phylogenetic lineage or across a subtree. The mean observed distance is calculated as the average chemical distance over all possible ancestral sequence reconstructions, weighted by their likelihood. Our method substantially improves over previous approaches by taking into account the stochastic process, tree phylogeny, among-site rate variation, and alternative ancestral reconstructions. We provide a linear-time algorithm for applying this test to all branches and all subtrees of a given phylogenetic tree. We validate this approach by applying it to two well-studied datasets: the MHC class I glycoproteins, serving as a positive control, and the housekeeping gene carbonic anhydrase I, serving as a negative control.
UR - http://www.scopus.com/inward/record.url?scp=23044531504&partnerID=8YFLogxK
U2 - 10.1007/3-540-44696-6_11
DO - 10.1007/3-540-44696-6_11
M3 - Conference contribution
AN - SCOPUS:23044531504
SN - 3540425160
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 142
EP - 155
BT - Algorithms in Bioinformatics - First International Workshop, WABI 2001 Århus Denmark, August 28-31, 2001 Proceedings
A2 - Moret, Bernard M. E.
A2 - Gascuel, Olivier
PB - Springer Verlag
T2 - 1st International Workshop on Algorithms in Bioinformatics, WABI 2001
Y2 - 28 August 2001 through 31 August 2001
ER -