TY - JOUR
T1 - Assessing the prediction fidelity of ancestral reconstruction by a library approach
AU - Bar-Rogovsky, Hagit
AU - Stern, Adi
AU - Penn, Osnat
AU - Kobl, Iris
AU - Pupko, Tal
AU - Tawfik, Dan S.
N1 - Publisher Copyright: © The Author 2015. Published by Oxford University Press. All rights reserved.
PY - 2015/11
Y1 - 2015/11
N2 - Ancestral reconstruction is a powerful tool for studying protein evolution as well as for protein design and engineering. However, in many positions alternative predictions with relatively high marginal probabilities exist, and thus the prediction comprises an ensemble of near-ancestor sequences that relate to the historical ancestor. The ancestral phenotype should therefore be explored for the entire ensemble, rather than for the sequence comprising the most probable amino acid at all positions [the most probable ancestor (mpa)]. To this end, we constructed libraries that sample ensembles of near-ancestor sequences. Specifically, we identified positions where alternatively predicted amino acids are likely to affect the ancestor's structure and/or function. Using the serum paraoxonases (PONs) enzyme family as a test case, we constructed libraries that combinatorially sample these alternatives. We next characterized these libraries, reflecting the vertebrate and mammalian PON ancestors. We found that the mpa of vertebrate PONs represented only one out of many different enzymatic phenotypes displayed by its ensemble. The mammalian ancestral library, however, exhibited a homogeneous phenotype that was well represented by the mpa. Our library design strategy that samples near-ancestor ensembles at potentially critical positions therefore provides a systematic way of examining the robustness of inferred ancestral phenotypes.
AB - Ancestral reconstruction is a powerful tool for studying protein evolution as well as for protein design and engineering. However, in many positions alternative predictions with relatively high marginal probabilities exist, and thus the prediction comprises an ensemble of near-ancestor sequences that relate to the historical ancestor. The ancestral phenotype should therefore be explored for the entire ensemble, rather than for the sequence comprising the most probable amino acid at all positions [the most probable ancestor (mpa)]. To this end, we constructed libraries that sample ensembles of near-ancestor sequences. Specifically, we identified positions where alternatively predicted amino acids are likely to affect the ancestor's structure and/or function. Using the serum paraoxonases (PONs) enzyme family as a test case, we constructed libraries that combinatorially sample these alternatives. We next characterized these libraries, reflecting the vertebrate and mammalian PON ancestors. We found that the mpa of vertebrate PONs represented only one out of many different enzymatic phenotypes displayed by its ensemble. The mammalian ancestral library, however, exhibited a homogeneous phenotype that was well represented by the mpa. Our library design strategy that samples near-ancestor ensembles at potentially critical positions therefore provides a systematic way of examining the robustness of inferred ancestral phenotypes.
KW - Ancestral sequence reconstruction
KW - Inferred ancestor
KW - Predicted ancestor
KW - Serum paraoxonase
UR - http://www.scopus.com/inward/record.url?scp=84948438701&partnerID=8YFLogxK
U2 - 10.1093/protein/gzv038
DO - 10.1093/protein/gzv038
M3 - Article
C2 - 26275856
AN - SCOPUS:84948438701
SN - 1741-0126
VL - 28
SP - 507
EP - 518
JO - Protein Engineering, Design and Selection
JF - Protein Engineering, Design and Selection
IS - 11
ER -