TY - JOUR
T1 - Haplotype reconstruction using perfect phylogeny and sequence data.
AU - Efros, Anatoly
AU - Halperin, Eran
N1 - Funding Information:
E.H. is a faculty fellow of the Edmond J. Safra Bioinformatics program at Tel-Aviv University. A.E. was supported by the Israel Science Foundation grant no. 04514831, and by the IBM Open Collaborative Research Program. This article has been published as part of BMC Bioinformatics Volume 13 Supplement 6, 2012: Proceedings of the Second Annual RECOMB Satellite Workshop on Massively Parallel Sequencing (RECOMB-seq 2012). The full contents of the supplement are available online at http://www. biomedcentral.com/bmcbioinformatics/supplements/13/S6.
PY - 2012
Y1 - 2012
N2 - Haplotype phasing is a well studied problem in the context of genotype data. With the recent developments in high-throughput sequencing, new algorithms are needed for haplotype phasing, when the number of samples sequenced is low and when the sequencing coverage is blow. High-throughput sequencing technologies enables new possibilities for the inference of haplotypes. Since each read is originated from a single chromosome, all the variant sites it covers must derive from the same haplotype. Moreover, the sequencing process yields much higher SNP density than previous methods, resulting in a higher correlation between neighboring SNPs. We offer a new approach for haplotype phasing, which leverages on these two properties. Our suggested algorithm, called Perfect Phlogeny Haplotypes from Sequencing (PPHS) uses a perfect phylogeny model and it models the sequencing errors explicitly. We evaluated our method on real and simulated data, and we demonstrate that the algorithm outperforms previous methods when the sequencing error rate is high or when coverage is low.
AB - Haplotype phasing is a well studied problem in the context of genotype data. With the recent developments in high-throughput sequencing, new algorithms are needed for haplotype phasing, when the number of samples sequenced is low and when the sequencing coverage is blow. High-throughput sequencing technologies enables new possibilities for the inference of haplotypes. Since each read is originated from a single chromosome, all the variant sites it covers must derive from the same haplotype. Moreover, the sequencing process yields much higher SNP density than previous methods, resulting in a higher correlation between neighboring SNPs. We offer a new approach for haplotype phasing, which leverages on these two properties. Our suggested algorithm, called Perfect Phlogeny Haplotypes from Sequencing (PPHS) uses a perfect phylogeny model and it models the sequencing errors explicitly. We evaluated our method on real and simulated data, and we demonstrate that the algorithm outperforms previous methods when the sequencing error rate is high or when coverage is low.
UR - http://www.scopus.com/inward/record.url?scp=84867852195&partnerID=8YFLogxK
U2 - 10.1186/1471-2105-13-S6-S3
DO - 10.1186/1471-2105-13-S6-S3
M3 - ???researchoutput.researchoutputtypes.contributiontojournal.article???
C2 - 22537042
AN - SCOPUS:84867852195
VL - 13 Suppl 6
SP - S3
JO - BMC Bioinformatics
JF - BMC Bioinformatics
SN - 1471-2105
M1 - S3
ER -