Harnessing machine learning to guide phylogenetic-tree search algorithms

Dana Azouri, Shiran Abadi, Yishay Mansour, Itay Mayrose*, Tal Pupko*

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

Abstract

Inferring a phylogenetic tree is a fundamental challenge in evolutionary studies. Current paradigms for phylogenetic tree reconstruction rely on performing costly likelihood optimizations. With the aim of making tree inference feasible for problems involving more than a handful of sequences, inference under the maximum-likelihood paradigm integrates heuristic approaches to evaluate only a subset of all potential trees. Consequently, existing methods suffer from the known tradeoff between accuracy and running time. In this proof-of-concept study, we train a machine-learning algorithm over an extensive cohort of empirical data to predict the neighboring trees that increase the likelihood, without actually computing their likelihood. This provides means to safely discard a large set of the search space, thus potentially accelerating heuristic tree searches without losing accuracy. Our analyses suggest that machine learning can guide tree-search methodologies towards the most promising candidate trees.
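
The idea in the abstract, ranking neighboring trees with a learned predictor and computing the likelihood only for the most promising ones, can be sketched as follows. This is a minimal illustration under stated assumptions and not the authors' implementation: `best_neighbor`, `predict_gain`, `log_likelihood`, and `keep_fraction` are hypothetical names, the predictor is assumed to be already trained, and trees are treated as opaque objects.

```python
from typing import Callable, Sequence, Tuple, TypeVar

T = TypeVar("T")  # any representation of a candidate tree


def best_neighbor(
    neighbors: Sequence[T],
    predict_gain: Callable[[T], float],    # hypothetical: cheap learned score
    log_likelihood: Callable[[T], float],  # hypothetical: costly optimization
    keep_fraction: float = 0.1,            # hypothetical: share of candidates kept
) -> Tuple[T, float, int]:
    """Return (best tree, its log-likelihood, number of likelihood evaluations)."""
    if not neighbors:
        raise ValueError("no neighbors to evaluate")

    # Rank every neighbor by the cheap predicted score, highest first.
    ranked = sorted(neighbors, key=predict_gain, reverse=True)

    # Keep only the top share of candidates (at least one).
    n_keep = max(1, int(len(ranked) * keep_fraction))
    shortlist = ranked[:n_keep]

    # Pay for the expensive likelihood computation on the shortlist only.
    scored = [(log_likelihood(tree), tree) for tree in shortlist]
    best_ll, best_tree = max(scored, key=lambda pair: pair[0])
    return best_tree, best_ll, len(shortlist)
```

With 20 candidates and `keep_fraction=0.25`, for example, only 5 likelihood computations are performed. Whether the true best neighbor survives the cut depends entirely on how well the predictor ranks candidates.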

Original language: English
Article number: 1983
Journal: Nature Communications
Volume: 12
Issue number: 1
DOIs
State: Published - 1 Dec 2021

Funding

Funders (with funder numbers where given):
Edmond J. Safra Center for Bioinformatics
Rothschild Caesarea Foundation
Israel Science Foundation: 802/16, 961/17, 993/17
Tel Aviv University
Council for Higher Education
