TY - GEN

T1 - A faster algorithm for RNA co-folding

AU - Ziv-Ukelson, Michal

AU - Gat-Viks, Irit

AU - Wexler, Ydo

AU - Shamir, Ron

PY - 2008

Y1 - 2008

N2 - The current pairwise RNA (secondary) structural alignment algorithms are based on Sankoff's dynamic programming algorithm from 1985. Sankoff's algorithm requires O(N^6) time and O(N^4) space, where N denotes the length of the compared sequences, and thus its applicability is very limited. The current literature offers many heuristics for speeding up Sankoff's alignment process, some making restrictive assumptions on the length or the shape of the RNA substructures. We show how to speed up Sankoff's algorithm in practice via non-heuristic methods, without compromising optimality. Our analysis shows that the expected time complexity of the new algorithm is O(N^4 ζ(N)), where ζ(N) converges to O(N), assuming a standard polymer folding model which was supported by experimental analysis. Hence our algorithm speeds up Sankoff's algorithm by a linear factor on average. In simulations, our algorithm speeds up computation by a factor of 3-12 for sequences of length 25-250. Availability: Code and data sets are available upon request.

AB - The current pairwise RNA (secondary) structural alignment algorithms are based on Sankoff's dynamic programming algorithm from 1985. Sankoff's algorithm requires O(N^6) time and O(N^4) space, where N denotes the length of the compared sequences, and thus its applicability is very limited. The current literature offers many heuristics for speeding up Sankoff's alignment process, some making restrictive assumptions on the length or the shape of the RNA substructures. We show how to speed up Sankoff's algorithm in practice via non-heuristic methods, without compromising optimality. Our analysis shows that the expected time complexity of the new algorithm is O(N^4 ζ(N)), where ζ(N) converges to O(N), assuming a standard polymer folding model which was supported by experimental analysis. Hence our algorithm speeds up Sankoff's algorithm by a linear factor on average. In simulations, our algorithm speeds up computation by a factor of 3-12 for sequences of length 25-250. Availability: Code and data sets are available upon request.

UR - http://www.scopus.com/inward/record.url?scp=56649106364&partnerID=8YFLogxK

U2 - 10.1007/978-3-540-87361-7_15

DO - 10.1007/978-3-540-87361-7_15

M3 - Conference contribution

AN - SCOPUS:56649106364

SN - 3540873600

SN - 9783540873600

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 174

EP - 185

BT - Algorithms in Bioinformatics - 8th International Workshop, WABI 2008, Proceedings

T2 - 8th International Workshop on Algorithms in Bioinformatics, WABI 2008

Y2 - 15 September 2008 through 19 September 2008

ER -