Incomplete directed perfect phylogeny

Itsik Pe'er, Tal Pupko, Ron Shamir, Roded Sharan*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

51 Scopus citations

Abstract

Perfect phylogeny is one of the fundamental models for studying evolution. We investigate the following variant of the model: The input is a species-characters matrix. The characters are binary and directed; i.e., a species can only gain characters. The difference from standard perfect phylogeny is that for some species the states of some characters are unknown. The question is whether one can complete the missing states in a way that admits a perfect phylogeny. The problem arises in classical phylogenetic studies, when some states are missing or undetermined. Quite recently, studies that infer phylogenies using inserted repeat elements in DNA gave rise to the same problem. Extant solutions for it take time O(n^2 m) for n species and m characters. We provide a graph theoretic formulation of the problem as a graph sandwich problem, and give near-optimal O(nm)-time algorithms for the problem. We also study the problem of finding a single, general solution tree, from which any other solution can be obtained by node splitting. We provide an algorithm to construct such a tree, or determine that none exists.
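
To make the problem statement concrete, the following is a minimal Python sketch of the decision problem only. It is not the graph sandwich algorithm from the article. It relies on the standard characterization that a complete binary directed matrix admits a perfect phylogeny exactly when, for every pair of characters, the sets of species carrying them are disjoint or nested. The function names and the use of None for an unknown state are assumptions made for illustration, and the completion search is exponential in the number of unknown entries, so it is suitable only for tiny inputs.

```python
from itertools import product


def admits_directed_perfect_phylogeny(matrix):
    """Check a complete 0/1 matrix (rows = species, columns = characters).

    Uses the pairwise test: the carrier sets of any two characters
    must be disjoint or one must contain the other.
    """
    n_chars = len(matrix[0]) if matrix else 0
    # For each character, the set of species that possess it.
    carriers = [
        frozenset(i for i, row in enumerate(matrix) if row[j] == 1)
        for j in range(n_chars)
    ]
    for a in range(n_chars):
        for b in range(a + 1, n_chars):
            s, t = carriers[a], carriers[b]
            if s & t and not (s <= t or t <= s):
                return False
    return True


def brute_force_completion(matrix):
    """Illustrative exhaustive search (hypothetical helper, exponential time).

    Entries equal to None are unknown states. Returns a completed matrix
    that admits a directed perfect phylogeny, or None if no completion does.
    """
    unknown = [
        (i, j)
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if value is None
    ]
    for values in product((0, 1), repeat=len(unknown)):
        candidate = [list(row) for row in matrix]
        for (i, j), value in zip(unknown, values):
            candidate[i][j] = value
        if admits_directed_perfect_phylogeny(candidate):
            return candidate
    return None
```

For example, brute_force_completion([[1, 1], [1, None], [None, 1]]) finds a completion, whereas the complete matrix [[1, 1], [1, 0], [0, 1]] admits none because the two characters share the first species without one carrier set containing the other.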

Original language: English
Pages (from-to): 590-607
Number of pages: 18
Journal: SIAM Journal on Computing
Volume: 33
Issue number: 3
State: Published - 2004

Keywords

  • Evolution
  • Graph sandwich
  • Incomplete data
  • Perfect phylogeny
