Several algorithms have been designed to search for homologous sequences (Fitch 1966; Needleman and Wunsch 1970; Chvátal and Sankoff 1975; Griggs 1977; Sankoff 1972; Smith and Waterman 1981; Smith et al. 1981; Wagner and Fisher 1974; Waterman et al. 1976). This paper presents some simple, high-speed "text editing" algorithms that search for exact nucleotide sequence repetition and genome duplication. The last algorithm suggested here is specifically adapted to the 4-letter alphabet of nucleotide sequences. Owing to the rapid accumulation of nucleotide sequences and the frequent need to search for sequence repetition, or to locate where a given set of nucleotides occurs in long sequences, efficient algorithms of this type are a necessity.
- Exact repetition
- Sequence analysis