Genetic code symmetry and efficient design of GC-constrained coding sequences

Matan Gavish, Amnon Peled, Benny Chor*

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

5 Scopus citations


Motivation: Cloning of long DNA sequences (40-60 bases) into phage display libraries using polymerase chain reaction (PCR) is a low-efficiency process, in which PCR is used to incorporate a DNA insert, coding for a certain peptide, into the amplified sequence. The PCR efficiency in this process is strongly affected by the distribution of G-C bases in the amplified sequence. As any DNA insert coding for the target peptide may be attempted, there is flexibility in choosing part of the amplified sequence. Since the number of inserts coding for the same peptide is exponential in the peptide length, a computational problem naturally arises: that of efficiently finding an insert whose parameters are optimal for PCR cloning.

Results: The GC distribution requirements are formulated as a search problem. We developed an efficient, linear-time 'one pass' algorithm for this problem. Interestingly, our algorithm relies strongly on a symmetry that we observed in the standard genetic code. Most non-standard genetic codes examined possess this symmetry as well, yet some do not. We generalize the search problem and consider the case of a non-standard, or arbitrary, genetic code where this symmetry does not necessarily hold. We solve the generalized problem in polynomial, but nonlinear, time.
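To make the size of the search space concrete, the following is a minimal Python sketch. It is not the authors' linear-time algorithm, which the abstract does not spell out. It assumes the standard genetic code and simplifies the constraint to the total number of G/C bases in the insert, rather than their distribution. The helper names (`synonymous_count`, `gc_range`, `design`) are hypothetical and exist only for this illustration. The search is a plain dynamic program over reachable GC totals, so it runs in polynomial rather than linear time.

```python
from itertools import product
from math import prod

# Standard genetic code in the conventional TCAG ordering ('*' marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid to the list of codons that encode it.
CODONS = {}
for codon, aa in zip(("".join(t) for t in product(BASES, repeat=3)), AMINO):
    CODONS.setdefault(aa, []).append(codon)


def gc(seq):
    """Number of G or C bases in a DNA string."""
    return sum(base in "GC" for base in seq)


def synonymous_count(peptide):
    """Number of distinct DNA inserts that encode the peptide."""
    return prod(len(CODONS[aa]) for aa in peptide)


def gc_range(peptide):
    """Smallest and largest total GC count over all encodings of the peptide."""
    lo = sum(min(gc(c) for c in CODONS[aa]) for aa in peptide)
    hi = sum(max(gc(c) for c in CODONS[aa]) for aa in peptide)
    return lo, hi


def design(peptide, target_gc):
    """Return one encoding with exactly target_gc G/C bases, or None.

    Illustrative dynamic program (an assumption of this sketch, not the
    paper's method): reach[i] maps each GC total achievable with the first
    i residues to a (previous total, codon) pair used for backtracking.
    """
    reach = [{0: None}]
    for aa in peptide:
        layer = {}
        for total in reach[-1]:
            for codon in CODONS[aa]:
                layer.setdefault(total + gc(codon), (total, codon))
        reach.append(layer)
    if target_gc not in reach[-1]:
        return None
    chosen, total = [], target_gc
    for layer in reversed(reach[1:]):
        total, codon = layer[total]
        chosen.append(codon)
    return "".join(reversed(chosen))
```

For example, `synonymous_count("LSR")` is 216 for a three-residue peptide, since leucine, serine and arginine each have six codons, which illustrates why enumerating all inserts for a 13-20 residue peptide quickly becomes impractical.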

Original language: English
Pages (from-to): e57-e63
Issue number: 2
State: Published - 2007

