Maximum likelihood of evolutionary trees: Hardness and approximation

Research output: Contribution to journal › Article › peer-review

Abstract

Motivation: Maximum likelihood (ML) is an increasingly popular optimality criterion for selecting evolutionary trees. Yet the computational complexity of ML remained open for over 20 years and was only recently resolved by the authors for the Jukes-Cantor model of substitution and its generalizations: reconstructing the ML tree was proved to be computationally intractable (NP-hard). In this work we explore three directions that extend that result.

Results: (1) We show that ML under the molecular clock assumption is still computationally intractable (NP-hard). (2) We show that it is intractable not only to find the exact ML tree, but even to approximate the logarithm of the ML to within any multiplicative factor smaller than 1.00175. (3) We develop an algorithm for approximating the log-likelihood under the condition that the input sequences are sparse. It employs any approximation algorithm for parsimony and asymptotically achieves the same approximation ratio. We note that ML reconstruction remains hard under this sparsity condition, and that many real datasets satisfy it.
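To make the two quantities in the abstract concrete, the following minimal sketch computes a Jukes-Cantor log-likelihood and a parsimony score for the simplest possible case: two aligned sequences joined by a single branch. It is an illustration only and is not the approximation algorithm from the paper. The function names, the uniform root distribution, and the example sequences are assumptions introduced here. The branch length t is measured in expected substitutions per site.

```python
import math


def jc_probs(t):
    """Jukes-Cantor probabilities over a branch of length t.

    Returns (p_same, p_diff): the probability that a site keeps its
    nucleotide, and the probability that it ends as one specific
    different nucleotide.
    """
    decay = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * decay, 0.25 - 0.25 * decay


def hamming(seq_a, seq_b):
    """Parsimony score of a two-leaf tree: the number of differing sites."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def pair_log_likelihood(seq_a, seq_b, t):
    """Log-likelihood of two aligned sequences separated by branch length t.

    Assumes a uniform (1/4) distribution for the nucleotide at one end
    and requires t > 0 when the sequences differ.
    """
    p_same, p_diff = jc_probs(t)
    differing = hamming(seq_a, seq_b)
    matching = len(seq_a) - differing
    log_l = len(seq_a) * math.log(0.25)
    if matching:
        log_l += matching * math.log(p_same)
    if differing:
        log_l += differing * math.log(p_diff)
    return log_l


def jc_ml_branch_length(seq_a, seq_b):
    """Closed-form ML branch length for a pair under Jukes-Cantor."""
    p = hamming(seq_a, seq_b) / len(seq_a)
    if p >= 0.75:
        raise ValueError("sequences too divergent for a finite estimate")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    a, b = "ACGTACGTAC", "ACGTACGTAA"  # hypothetical example data
    t_hat = jc_ml_branch_length(a, b)
    print("parsimony score:", hamming(a, b))
    print("ML branch length:", t_hat)
    print("log-likelihood at ML branch length:", pair_log_likelihood(a, b, t_hat))
```

For two leaves both scores are easy to compute. The hardness results in the abstract concern the search over tree topologies and branch lengths for many sequences, which this sketch does not attempt.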

Original language: English
Pages (from-to): i97-i106
Journal: Bioinformatics
Volume: 21
Issue number: SUPPL. 1
State: Published - Jun 2005
