Ancestral sequence reconstruction: Accounting for structural information by averaging over replacement matrices

Asher Moshe, Tal Pupko*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Motivation: Ancestral sequence reconstruction (ASR) is widely used to understand protein evolution, structure and function. Current ASR methodologies do not fully consider differences in evolutionary constraints among positions imposed by the three-dimensional (3D) structure of the protein.

Results: Here, we developed an ASR algorithm that allows different protein sites to evolve according to different mixtures of replacement matrices. We show that assigning replacement matrices to protein positions based on their solvent accessibility leads to ASR with higher log-likelihoods compared to naïve models that assume a single replacement matrix for all sites. Improved ASR log-likelihoods are also demonstrated when solvent accessibility is predicted from protein sequences rather than inferred from a known 3D structure. Finally, we show that using such structure-aware mixture models results in substantial differences in the inferred ancestral sequences.

Availability and implementation: http://fastml.tau.ac.il.

Supplementary information: Supplementary data are available at Bioinformatics online.
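The idea of letting a site evolve under a mixture of replacement matrices can be illustrated with a generic mixture-model calculation. The sketch below is not the authors' implementation: it assumes the per-site log-likelihood under each matrix has already been computed elsewhere, and the function name, the two matrix labels and all numeric values are hypothetical. It only shows how per-matrix likelihoods might be averaged with site-specific weights.

```python
import math


def mixture_site_log_likelihood(log_likelihoods, weights):
    """Return log(sum_k w_k * exp(logL_k)) for one site.

    log_likelihoods: per-matrix log-likelihoods of the site (assumed given).
    weights: mixture weights for this site; must sum to 1.
    """
    if len(log_likelihoods) != len(weights):
        raise ValueError("one weight is required per replacement matrix")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")

    # Work in log space (log-sum-exp) to avoid underflow on long alignments.
    terms = [math.log(w) + ll for w, ll in zip(weights, log_likelihoods) if w > 0]
    largest = max(terms)
    return largest + math.log(sum(math.exp(t - largest) for t in terms))


# Hypothetical values: one site scored under a "buried" and an "exposed" matrix.
per_matrix_logl = [-12.3, -10.8]

# Hypothetical weights for a site judged mostly solvent-exposed.
exposed_site_weights = [0.2, 0.8]

site_logl = mixture_site_log_likelihood(per_matrix_logl, exposed_site_weights)
```

Under this assumption, a single-matrix model is the special case in which one weight equals 1 and the rest are 0, so the mixture score reduces to that matrix's own log-likelihood.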

Original language: English
Pages (from-to): 2562-2568
Number of pages: 7
Journal: Bioinformatics
Volume: 35
Issue number: 15
State: Published - 1 Aug 2019
