TY - JOUR
T1 - VOMBAT
T2 - Prediction of transcription factor binding sites using variable order Bayesian trees
AU - Grau, Jan
AU - Ben-Gal, Irad
AU - Posch, Stefan
AU - Grosse, Ivo
N1 - Funding Information:
We thank André Gohr for implementing many of the algorithms, Martin Oertel for valuable discussions, and the German Ministry of Education and Research (BMBF Grant No. 0312706A/D) for financial support. Funding to pay the Open Access publication charges for this article was provided by IPK Gatersleben.
PY - 2006/7
Y1 - 2006/7
N2 - Variable order Markov models and variable order Bayesian trees have been proposed for the recognition of transcription factor binding sites, and it could be demonstrated that they outperform traditional models, such as position weight matrices, Markov models and Bayesian trees. We develop a web server for the recognition of DNA binding sites based on variable order Markov models and variable order Bayesian trees offering the following functionality: (i) given datasets with annotated binding sites and genomic background sequences, variable order Markov models and variable order Bayesian trees can be trained; (ii) given a set of trained models, putative DNA binding sites can be predicted in a given set of genomic sequences and (iii) given a dataset with annotated binding sites and a dataset with genomic background sequences, cross-validation experiments for different model combinations with different parameter settings can be performed. Several of the offered services are computationally demanding, such as genome-wide predictions of DNA binding sites in mammalian genomes or sets of 10^4-fold cross-validation experiments for different model combinations based on problem-specific data sets. In order to execute these jobs, and in order to serve multiple users at the same time, the web server is attached to a Linux cluster with 150 processors. VOMBAT is available at http://pdw-24.ipk-gatersleben.de:8080/VOMBAT/.
AB - Variable order Markov models and variable order Bayesian trees have been proposed for the recognition of transcription factor binding sites, and it could be demonstrated that they outperform traditional models, such as position weight matrices, Markov models and Bayesian trees. We develop a web server for the recognition of DNA binding sites based on variable order Markov models and variable order Bayesian trees offering the following functionality: (i) given datasets with annotated binding sites and genomic background sequences, variable order Markov models and variable order Bayesian trees can be trained; (ii) given a set of trained models, putative DNA binding sites can be predicted in a given set of genomic sequences and (iii) given a dataset with annotated binding sites and a dataset with genomic background sequences, cross-validation experiments for different model combinations with different parameter settings can be performed. Several of the offered services are computationally demanding, such as genome-wide predictions of DNA binding sites in mammalian genomes or sets of 10^4-fold cross-validation experiments for different model combinations based on problem-specific data sets. In order to execute these jobs, and in order to serve multiple users at the same time, the web server is attached to a Linux cluster with 150 processors. VOMBAT is available at http://pdw-24.ipk-gatersleben.de:8080/VOMBAT/.
UR - http://www.scopus.com/inward/record.url?scp=33747817698&partnerID=8YFLogxK
U2 - 10.1093/nar/gkl212
DO - 10.1093/nar/gkl212
M3 - Article
AN - SCOPUS:33747817698
VL - 34
SP - W529-W533
JO - Nucleic Acids Research
JF - Nucleic Acids Research
SN - 0305-1048
IS - Web Server issue
ER -