Do tree split probabilities determine the branch lengths?

Benny Chor, Mike Steel*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Abstract

The evolution of aligned DNA sequence sites is generally modeled by a Markov process operating along the edges of a phylogenetic tree. It is well known that the probability distribution on the site patterns at the tips of the tree determines both the tree topology and its branch lengths. However, the number of patterns is typically much larger than the number of edges, suggesting considerable redundancy in the branch length estimation. In this paper we ask whether the probabilities of just the 'edge-specific' patterns (the ones that correspond to a change of state on a single edge) suffice to recover the branch lengths of the tree, under a symmetric 2-state Markov process. We first show that they do, provided the branch lengths are sufficiently short, by applying the inverse function theorem. We then consider whether this restriction to short branch lengths is necessary, and show that for trees with up to four leaves it can be lifted. Whether it can be lifted for trees in general remains an interesting open question. Our results also extend to certain Markov processes on more than two states, such as the Jukes-Cantor model.
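As a rough illustration of the question posed in the abstract, the sketch below works through the smallest case: a star tree with three leaves under the symmetric 2-state model. It computes the three edge-specific split probabilities by enumerating which edges carry a state change, then recovers the per-edge change probabilities from those three numbers alone. The choice of tree, the function names (`split_probs`, `recover_change_probs`, `branch_length`) and the simple fixed-point iteration are all assumptions made for this example and are not taken from the paper, which argues via the inverse function theorem. The iteration is only expected to converge when the change probabilities are small. The conversion from change probability p to branch length uses the standard 2-state relation t = -0.5 ln(1 - 2p).

```python
from itertools import product
from math import log


def split_probs(p):
    """Edge-specific split probabilities on a 3-leaf star tree.

    p[i] is the probability of a state change on the edge to leaf i.
    s[i] is the probability that leaf i differs from the other two
    leaves, which agree with each other.
    """
    s = [0.0, 0.0, 0.0]
    # Enumerate every combination of "edge i changed / did not change".
    for flips in product((0, 1), repeat=3):
        prob = 1.0
        for p_e, f in zip(p, flips):
            prob *= p_e if f else 1.0 - p_e
        for i in range(3):
            j, k = (i + 1) % 3, (i + 2) % 3
            if flips[j] == flips[k] and flips[j] != flips[i]:
                s[i] += prob
    return s


def recover_change_probs(s, iterations=200):
    """Invert split_probs by fixed-point iteration (illustrative only).

    Uses s_i = p_i * (1 - p_j - p_k) + p_j * p_k, rearranged for p_i.
    Assumes short branches so that the iteration contracts.
    """
    p = list(s)  # for short branches s_i is close to p_i
    for _ in range(iterations):
        p = [
            (s[i] - p[(i + 1) % 3] * p[(i + 2) % 3])
            / (1.0 - p[(i + 1) % 3] - p[(i + 2) % 3])
            for i in range(3)
        ]
    return p


def branch_length(p_e):
    """Expected number of changes on an edge with change probability p_e."""
    return -0.5 * log(1.0 - 2.0 * p_e)


if __name__ == "__main__":
    p_true = [0.05, 0.10, 0.08]  # hypothetical short branches
    s = split_probs(p_true)
    p_hat = recover_change_probs(s)
    print("split probabilities:", s)
    print("recovered change probabilities:", p_hat)
    print("recovered branch lengths:", [branch_length(x) for x in p_hat])
```

Here three split probabilities are matched against three unknowns, which mirrors the counting argument in the abstract: one edge-specific pattern per edge, rather than the full distribution over all site patterns.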

Original language: English
Pages (from-to): 54-59
Number of pages: 6
Journal: Journal of Theoretical Biology
Volume: 374
State: Published - 7 Jun 2015

Funding

Funders
Israeli Science Foundation
Allan Wilson Centre

    Keywords

    • Evolutionary model
    • Hadamard transform
    • Inverse function theorem
    • Markov process
    • Phylogenetic tree reconstruction
    • Systems of polynomial equations
