TY - JOUR
T1 - Genetic code symmetry and efficient design of GC-constrained coding sequences
AU - Gavish, Matan
AU - Peled, Amnon
AU - Chor, Benny
PY - 2007
Y1 - 2007
N2 - Motivation: Cloning of long DNA sequences (40-60 bases) into phage display libraries using polymerase chain reaction (PCR) is a low-efficiency process, in which PCR is used to incorporate a DNA insert, coding for a certain peptide, into the amplified sequence. The PCR efficiency in this process is strongly affected by the distribution of G-C bases in the amplified sequence. As any DNA insert coding for the target peptide may be attempted, there is flexibility in choosing part of the amplified sequence. Since the number of inserts coding for the same peptide is exponential in the peptide length, a computational problem naturally arises - that of efficiently finding an insert whose parameters are optimal for PCR cloning. Results: The GC distribution requirements are formulated as a search problem. We developed an efficient, linear-time 'one pass' algorithm for this problem. Interestingly, our algorithm relies strongly on a symmetry that we observed in the standard genetic code. Most non-standard genetic codes examined possess this symmetry as well, yet some do not. We generalize the search problem and consider the case of a non-standard, or arbitrary, genetic code where this symmetry does not necessarily hold. We solve the generalized problem in polynomial, but nonlinear, time.
AB - Motivation: Cloning of long DNA sequences (40-60 bases) into phage display libraries using polymerase chain reaction (PCR) is a low-efficiency process, in which PCR is used to incorporate a DNA insert, coding for a certain peptide, into the amplified sequence. The PCR efficiency in this process is strongly affected by the distribution of G-C bases in the amplified sequence. As any DNA insert coding for the target peptide may be attempted, there is flexibility in choosing part of the amplified sequence. Since the number of inserts coding for the same peptide is exponential in the peptide length, a computational problem naturally arises - that of efficiently finding an insert whose parameters are optimal for PCR cloning. Results: The GC distribution requirements are formulated as a search problem. We developed an efficient, linear-time 'one pass' algorithm for this problem. Interestingly, our algorithm relies strongly on a symmetry that we observed in the standard genetic code. Most non-standard genetic codes examined possess this symmetry as well, yet some do not. We generalize the search problem and consider the case of a non-standard, or arbitrary, genetic code where this symmetry does not necessarily hold. We solve the generalized problem in polynomial, but nonlinear, time.
UR - http://www.scopus.com/inward/record.url?scp=33846685012&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btl317
DO - 10.1093/bioinformatics/btl317
M3 - Article
C2 - 17237106
AN - SCOPUS:33846685012
SN - 1367-4803
VL - 23
SP - e57-e63
JO - Bioinformatics
JF - Bioinformatics
IS - 2
ER -