Reconstruction of ancestral genomic sequences using likelihood

Isaac Elias*, Tamir Tuller

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

14 Scopus citations

Abstract

A challenging task in computational biology is the reconstruction of genomic sequences of extinct ancestors, given the phylogenetic tree and the sequences at the leaves. This task is best solved by calculating the most likely estimate of the ancestral sequences, along with the most likely edge lengths. We address this problem and also the variant in which the phylogenetic tree, in addition to the ancestral sequences, needs to be estimated. The latter problem is known to be NP-hard, while the computational complexity of the former is unknown. Currently, all algorithms for solving these problems are heuristics without performance guarantees. The biological importance of these problems calls for developing better algorithms with guarantees of finding either optimal or approximate solutions. We develop approximation, fixed-parameter tractable (FPT), and fast heuristic algorithms for two variants of the problem: when the phylogenetic tree is known and when it is unknown. The approximation algorithm guarantees a solution with a log-likelihood ratio of 2 relative to the optimal solution. The FPT algorithm has a running time that is polynomial in the length of the sequences and exponential in the number of taxa. This makes it useful for calculating the optimal solution for small trees. Moreover, we combine the approximation algorithm and the FPT algorithm into an algorithm with an arbitrarily good approximation guarantee (PTAS). We tested our algorithms on both synthetic and biological data. In particular, we used the FPT algorithm for computing the most likely ancestral mitochondrial genomes of Hominidae (the great apes), thereby answering an interesting biological question. Moreover, we show how the approximation algorithms find good solutions for reconstructing the ancestral genomes for a set of lentiviruses (relatives of HIV). Supplementary material of this work is available at www.nada.kth.se/~isaac/publications/aml/aml.html.
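To make the objective concrete, the following is a minimal sketch of the known-tree variant under a symmetric two-state substitution model. The model is an assumption made here for illustration, since the abstract does not name one. Under that assumption, an edge whose endpoint sequences of length k differ at d sites has most likely change probability p = d/k (capped at 1/2), so the score of a labeling depends only on the Hamming distances along the edges. The function names `hamming`, `edge_log_likelihood`, and `best_ancestral_labeling` are hypothetical, and the exhaustive search is plain brute force over all internal labelings, not the FPT, approximation, or heuristic algorithms of the paper.

```python
import math
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def edge_log_likelihood(a, b):
    """Log-likelihood of one edge with its change probability set to the
    most likely value p = d/k, capped at 1/2 (assumed two-state model)."""
    k = len(a)
    d = hamming(a, b)
    p = min(d / k, 0.5)
    if p == 0.0:
        return 0.0  # identical endpoints: likelihood 1
    return d * math.log(p) + (k - d) * math.log(1.0 - p)


def best_ancestral_labeling(edges, leaves, internal, k):
    """Brute force: try every binary labeling of the internal nodes and keep
    the one with the highest total log-likelihood. The constant root prior
    term is omitted. Exponential in k * len(internal); toy inputs only."""
    best_ll, best_labels = -math.inf, None
    all_seqs = ["".join(bits) for bits in product("01", repeat=k)]
    for combo in product(all_seqs, repeat=len(internal)):
        labels = dict(leaves)
        labels.update(zip(internal, combo))
        ll = sum(edge_log_likelihood(labels[u], labels[v]) for u, v in edges)
        if ll > best_ll:
            best_ll, best_labels = ll, labels
    return best_ll, best_labels


# Toy quartet tree: leaves A, B attach to X; leaves C, D attach to Y.
edges = [("A", "X"), ("B", "X"), ("X", "Y"), ("C", "Y"), ("D", "Y")]
leaves = {"A": "000", "B": "000", "C": "111", "D": "111"}
ll, labels = best_ancestral_labeling(edges, leaves, ["X", "Y"], k=3)
print(labels["X"], labels["Y"], round(ll, 4))
```

Because each edge length is shared by all sites, the score is not a sum of independent per-site terms, which is one reason the sites cannot simply be optimized one at a time.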

Original language: English
Pages (from-to): 216-237
Number of pages: 22
Journal: Journal of Computational Biology
Volume: 14
Issue number: 2
State: Published - Mar 2007

Keywords

  • 2-approximation
  • Ancestral maximum likelihood
  • FPT
  • Most parsimonious likelihood
  • PTAS
