TY - JOUR
T1 - Spectrum alignment
T2 - efficient resequencing by hybridization.
AU - Pe'er, I.
AU - Shamir, R.
PY - 2000
Y1 - 2000
N2 - Recent high-density microarray technologies allow, in principle, the determination of all k-mers that appear along a DNA sequence, for k = 8 - 10 in a single experiment on a standard chip. The k-mer contents, also called the spectrum of the sequence, is not sufficient to uniquely reconstruct a sequence longer than a few hundred bases. We have devised a polynomial algorithm that reconstructs the sequence, given the spectrum and a homologous sequence. This situation occurs, for example, in the identification of single nucleotide polymorphisms (SNPs), and whenever a homologue of the target sequence is known. The algorithm is robust, can handle errors in the spectrum and assumes no knowledge of the k-mer multiplicities. Our simulations show that with realistic levels of SNPs, the algorithm correctly reconstructs a target sequence of length up to 2,000 nucleotides when a polymorphic sequence is known. The technique is generalized to handle profiles and HMMs as input instead of a single homologous sequence.
AB - Recent high-density microarray technologies allow, in principle, the determination of all k-mers that appear along a DNA sequence, for k = 8 - 10 in a single experiment on a standard chip. The k-mer contents, also called the spectrum of the sequence, is not sufficient to uniquely reconstruct a sequence longer than a few hundred bases. We have devised a polynomial algorithm that reconstructs the sequence, given the spectrum and a homologous sequence. This situation occurs, for example, in the identification of single nucleotide polymorphisms (SNPs), and whenever a homologue of the target sequence is known. The algorithm is robust, can handle errors in the spectrum and assumes no knowledge of the k-mer multiplicities. Our simulations show that with realistic levels of SNPs, the algorithm correctly reconstructs a target sequence of length up to 2,000 nucleotides when a polymorphic sequence is known. The technique is generalized to handle profiles and HMMs as input instead of a single homologous sequence.
UR - http://www.scopus.com/inward/record.url?scp=0034564260&partnerID=8YFLogxK
M3 - ???researchoutput.researchoutputtypes.contributiontojournal.article???
AN - SCOPUS:0034564260
SN - 1553-0833
VL - 8
SP - 260
EP - 268
JO - Proceedings / ... International Conference on Intelligent Systems for Molecular Biology ; ISMB. International Conference on Intelligent Systems for Molecular Biology
JF - Proceedings / ... International Conference on Intelligent Systems for Molecular Biology ; ISMB. International Conference on Intelligent Systems for Molecular Biology
ER -