TY - JOUR
T1 - A sequence-based filtering method for ncRNA identification and its application to searching for riboswitch elements
AU - Zhang, Shaojie
AU - Borovok, Ilya
AU - Aharonowitz, Yair
AU - Sharan, Roded
AU - Bafna, Vineet
N1 - Funding Information:
This work is supported by a grant from the National Science Foundation (NSF-DBI:0516440) (S.Z. and V.B.) and by an Alon Fellowship (R.S.). This research is also supported in part by the UCSD FWGrid Project (NSF Research Infrastructure Grant Number EIA-0303622).
PY - 2006/7/15
Y1 - 2006/7/15
AB - Motivation: Recent studies have uncovered an "RNA world", in which non-coding RNA (ncRNA) sequences play a central role in the regulation of gene expression. Computational studies on ncRNA have been directed toward developing detection methods for ncRNAs. State-of-the-art methods for the problem, like covariance models, suffer from high computational cost, underscoring the need for efficient filtering approaches that can identify promising sequence segments and speed up the detection process. Results: In this paper we make several contributions toward this goal. First, we formalize the concept of a filter and provide figures of merit that allow comparison between filters. Second, we design efficient sequence-based filters that dominate the current state-of-the-art HMM filters. Third, we provide a new formulation of the covariance model that allows speeding up RNA alignment. We demonstrate the power of our approach on both synthetic data and real bacterial genomes. We then apply our algorithm to the detection of novel riboswitch elements in whole bacterial and archaeal genomes. Our results point to a number of novel riboswitch candidates, and include genomes that were not previously known to contain riboswitches.
UR - http://www.scopus.com/inward/record.url?scp=33747872849&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btl232
DO - 10.1093/bioinformatics/btl232
M3 - Article
AN - SCOPUS:33747872849
SN - 1367-4803
VL - 22
SP - e557
EP - e565
JO - Bioinformatics
JF - Bioinformatics
IS - 14
ER -