TY - JOUR

T1 - Analytic solutions of maximum likelihood on forks of four taxa

AU - Chor, Benny

AU - Snir, Sagi

N1 - Funding Information:
Research supported by ISF grant 418/00. Some of these results were presented at the RECOMB 2003 conference in Berlin.

PY - 2007/8

Y1 - 2007/8

N2 - This work deals with symbolic mathematical solutions to maximum likelihood on small phylogenetic trees. Maximum likelihood (ML) is increasingly used as an optimality criterion for selecting evolutionary trees, but finding the global optimum is a hard computational task. In this work, we give general analytic solutions for a family of trees with four taxa, two-state characters, under a molecular clock. Previously, analytical solutions were known only for three-taxon trees. The change from three to four taxa incurs a major increase in the complexity of the underlying algebraic system, and requires novel techniques and approaches. Despite the simplicity of our model, solving ML analytically in it is close to the limit of today's tractability. Four-taxon rooted trees have two topologies: the fork (two subtrees with two leaves each) and the comb (one subtree with three leaves, the other with a single leaf). Combining the properties of molecular-clock fork trees with the Hadamard conjugation, and employing the symbolic algebra software Maple, we derive a number of topology-dependent identities. Using these identities, we substantially simplify the system of polynomial equations for the fork. We finally employ the symbolic algebra software to obtain closed-form analytic solutions (expressed parametrically in the input data).

KW - Analytic solutions

KW - Hadamard conjugation

KW - Maximum likelihood

KW - Molecular clock

KW - Phylogenetic trees

KW - Symbolic manipulation

UR - http://www.scopus.com/inward/record.url?scp=34547103029&partnerID=8YFLogxK

U2 - 10.1016/j.mbs.2006.04.001

DO - 10.1016/j.mbs.2006.04.001

M3 - Article

C2 - 17664091

AN - SCOPUS:34547103029

SN - 0025-5564

VL - 208

SP - 347

EP - 358

JO - Mathematical Biosciences

JF - Mathematical Biosciences

IS - 2

ER -