TY - JOUR
T1 - QuasiMotiFinder
T2 - Protein annotation by searching for evolutionarily conserved motif-like patterns
AU - Gutman, Roee
AU - Berezin, Carine
AU - Wollman, Roy
AU - Rosenberg, Yossi
AU - Ben-Tal, Nir
N1 - Funding Information: We are grateful to Allegra Via, Burkhard Rost and David Steinberg for helpful discussions, and to Eyal Privman and the Bioinformatics Service Unit at the George S. Wise Faculty of Life Sciences at Tel Aviv University for providing technical assistance and computational facilities. This work was supported by the European Commission FP6 Integrated Project EUROHEAR, LSHG-CT-20054-512063. Funding to pay the Open Access publication charges for this article was provided by N.B-T., Tel Aviv University.
PY - 2005
Y1 - 2005/07//
N2 - Sequence signature databases such as PROSITE, which include amino acid segments that are indicative of a protein's function, are useful for protein annotation. Lamentably, the annotation is not always accurate. A signature may be falsely detected in a protein that does not carry out the associated function (false positive prediction, FP) or may be overlooked in a protein that does carry out the function (false negative prediction, FN). A new approach has emerged in which a signature is replaced with a sequence profile, calculated based on multiple sequence alignment (MSA) of homologous proteins that share the same function. This approach, which is superior to the simple pattern search, essentially searches with the sequence of the query protein against an MSA library. We suggest here an alternative approach, implemented in the QuasiMotiFinder web server (http://quasimotifinder.tau.ac.il/), which is based on a search with an MSA of homologous query proteins against the original PROSITE signatures. The explicit use of the average evolutionary conservation of the signature in the query proteins significantly reduces the rate of FP prediction compared with the simple pattern search. QuasiMotiFinder also has a reduced rate of FN prediction compared with simple pattern searches, since the traditional search for precise signatures has been replaced by a permissive search for signature-like patterns that are physicochemically similar to known signatures. Overall, QuasiMotiFinder and the profile search are comparable to each other in terms of performance. They are also complementary to each other in that signatures that are falsely detected in (or overlooked by) one may be correctly detected by the other.
UR - http://www.scopus.com/inward/record.url?scp=23144445962&partnerID=8YFLogxK
U2 - 10.1093/nar/gki496
DO - 10.1093/nar/gki496
M3 - Article
AN - SCOPUS:23144445962
SN - 0305-1048
VL - 33
SP - W255-W261
JO - Nucleic Acids Research
JF - Nucleic Acids Research
IS - SUPPL. 2
ER -