Efficient reconstruction of haplotype structure via perfect phylogeny.

Eleazar Eskin*, Eran Halperin, Richard M. Karp

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

85 Scopus citations

Abstract

Each person's genome contains two copies of each chromosome, one inherited from the father and the other from the mother. A person's genotype specifies the pair of bases at each site, but does not specify which base occurs on which chromosome. The sequence of each chromosome separately is called a haplotype. The determination of the haplotypes within a population is essential for understanding genetic variation and the inheritance of complex diseases. The haplotype mapping project, a successor to the human genome project, seeks to determine the common haplotypes in the human population. Since experimental determination of a person's genotype is less expensive than determining the component haplotypes, algorithms are required for computing haplotypes from genotypes. Two observations aid in this process: first, the human genome contains short blocks within which only a few different haplotypes occur; second, as suggested by Gusfield, it is reasonable to assume that the haplotypes observed within a block have evolved according to a perfect phylogeny, in which at most one mutation event has occurred at any site, and no recombination has occurred in the given region. We present a simple and efficient polynomial-time algorithm for inferring haplotypes from the genotypes of a set of individuals, assuming a perfect phylogeny. Using a reduction to 2-SAT, we extend this algorithm to handle constraints that apply when we have genotypes from both parents and child. We also present a hardness result for the problem of removing the minimum number of individuals from a population to ensure that the genotypes of the remaining individuals are consistent with a perfect phylogeny. Our algorithms have been tested on real data and give biologically meaningful results. Our webserver (http://www.cs.columbia.edu/compbio/hap/) is publicly available for predicting haplotypes from genotype data and partitioning genotype data into blocks.
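
To make the problem statement concrete, the following is a minimal Python sketch of the setting, not the polynomial-time algorithm of the paper. It assumes biallelic sites encoded as 0/1, genotypes encoded per site as 0 or 1 (homozygous) or 2 (heterozygous), and it uses the standard four-gamete condition as the test for consistency with a perfect phylogeny when the ancestral state is unknown. The function names (genotype, passes_four_gamete_test, resolutions, phase_brute_force) are hypothetical, and the search is exponential in the number of heterozygous sites, so it is suitable only for toy inputs.

```python
from itertools import combinations, product


def genotype(h1, h2):
    """Combine two binary haplotypes into a genotype.

    Assumed encoding: 0 or 1 where both copies agree, 2 where they differ.
    """
    return tuple(a if a == b else 2 for a, b in zip(h1, h2))


def passes_four_gamete_test(haplotypes):
    """Return True if no pair of sites shows all four combinations 00, 01, 10, 11."""
    haps = list(haplotypes)
    if not haps:
        return True
    n_sites = len(haps[0])
    for i, j in combinations(range(n_sites), 2):
        if len({(h[i], h[j]) for h in haps}) == 4:
            return False
    return True


def resolutions(g):
    """Yield every unordered haplotype pair that is consistent with genotype g."""
    het = [k for k, v in enumerate(g) if v == 2]
    if not het:
        yield tuple(g), tuple(g)
        return
    # Fix the first heterozygous site of h1 to 0 so mirrored pairs are not repeated.
    for bits in product((0, 1), repeat=len(het) - 1):
        h1, h2 = list(g), list(g)
        for k, b in zip(het, (0,) + bits):
            h1[k] = b
            h2[k] = 1 - b
        yield tuple(h1), tuple(h2)


def phase_brute_force(genotypes):
    """Return one phasing whose haplotypes pass the four-gamete test, or None.

    Exhaustive search over all resolutions; illustration only.
    """
    per_individual = [list(resolutions(g)) for g in genotypes]
    for choice in product(*per_individual):
        haps = {h for pair in choice for h in pair}
        if passes_four_gamete_test(haps):
            return list(choice)
    return None


if __name__ == "__main__":
    # The doubly heterozygous individual (2, 2) is ambiguous on its own;
    # the two homozygous individuals force the resolution (0,1)/(1,0).
    print(phase_brute_force([(2, 2), (0, 1), (1, 0)]))
```

In this toy input the genotype (2, 2) could be resolved as (0,0)/(1,1) or as (0,1)/(1,0); only the second choice keeps the whole population free of a four-gamete violation, which is the kind of constraint the perfect phylogeny assumption contributes.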

Original language: English
Pages (from-to): 1-20
Number of pages: 20
Journal: Journal of Bioinformatics and Computational Biology
Volume: 1
Issue number: 1
State: Published - Apr 2003
Externally published: Yes
