EvoRator: Prediction of Residue-level Evolutionary Rates from Protein Structures Using Machine Learning

Natan Nagar, Nir Ben Tal, Tal Pupko*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

Measuring evolutionary rates at the residue level is indispensable for gaining structural and functional insights into proteins. State-of-the-art tools for estimating rates take as input a large set of homologous proteins, a probabilistic model of evolution, and a phylogenetic tree. However, a gap exists when only a few or no homologous proteins can be found, e.g., for orphan proteins. In addition, such tools do not take the three-dimensional (3D) structure of the protein into account. The association between the 3D structure and site-specific rates can be learned using machine-learning regression tools from a cohort of proteins for which both the structure and a large set of homologs exist. Here we present EvoRator, a user-friendly web server that implements a machine-learning regression algorithm to predict site-specific evolutionary rates from protein structures. We show that EvoRator outperforms predictions obtained using traditional physicochemical features, such as relative solvent accessibility and weighted contact number. We also demonstrate the application of EvoRator in three common scenarios that arise in protein evolution research: (1) orphan proteins, for which no (or few) homologs exist; (2) proteins with homologous sequences, for which our algorithm contrasts the structure-based estimates of the evolutionary rates with the phylogeny-based estimates, allowing detection of sites that are likely conserved due to functional rather than structural constraints; and (3) positions in gapped sequence alignments, which frequently occur as a result of a clade-specific insertion and whose evolutionary rates are often measured inaccurately by algorithms that rely only on homologous sequences. For such gapped positions, our algorithm makes use of training data and the known 3D structure to predict evolutionary rates. EvoRator is freely available for all users at: https://evorator.tau.ac.il/.
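Scenario (2) in the abstract, contrasting structure-based and phylogeny-based rates, can be illustrated with a minimal sketch. This is not EvoRator's actual procedure: the function name `flag_function_constrained_sites`, the z-score cutoff, and the toy rate values are all hypothetical and chosen only for illustration. The idea is that a site whose phylogeny-based rate is much lower than its structure-based prediction is more conserved than structure alone would explain, which hints at a functional constraint.

```python
from statistics import mean, stdev


def flag_function_constrained_sites(structure_rates, phylogeny_rates, z_cutoff=-1.5):
    """Return indices of sites evolving much slower than structure predicts.

    Hypothetical helper: the residual is (phylogeny-based rate minus
    structure-based rate); residuals are standardized and sites with a
    z-score at or below z_cutoff are flagged.
    """
    if len(structure_rates) != len(phylogeny_rates):
        raise ValueError("rate lists must have the same length")

    # A negative residual means the site is more conserved than the
    # structure-based prediction suggests.
    residuals = [obs - pred for pred, obs in zip(structure_rates, phylogeny_rates)]

    mu = mean(residuals)
    sigma = stdev(residuals)
    if sigma == 0:
        # All residuals identical: no site stands out.
        return []

    return [i for i, r in enumerate(residuals) if (r - mu) / sigma <= z_cutoff]


# Toy, made-up rates for eight residues (not real data).
structure_rates = [0.2, 1.1, 0.9, 1.4, 0.3, 1.2, 0.8, 1.0]
phylogeny_rates = [0.25, 1.0, 0.95, 1.3, 0.35, 0.1, 0.85, 1.05]

flagged = flag_function_constrained_sites(structure_rates, phylogeny_rates)
print(flagged)  # the site at index 5 is far more conserved than predicted
```

In this toy example only the sixth residue (index 5) is flagged, because its phylogeny-based rate (0.1) falls far below its structure-based prediction (1.2) while the other sites agree closely. A real analysis would need rates on a common scale and a cutoff justified by the data.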

Original language: English
Article number: 167538
Journal: Journal of Molecular Biology
Volume: 434
Issue number: 11
State: Published - 15 Jun 2022

Funding

Funders (funder number):
Abraham E. Kazan Chair
Israel Science Foundation (2818/21, 1764/21)
Tel Aviv University

    Keywords

    • ConSurf
    • gapped alignment
    • machine learning
    • orphan genes
    • protein evolution
    • protein function
    • protein structure
