Motif extraction and protein classification

Vered Kunik*, Zach Solan, Shimon Edelman, Eytan Ruppin, David Horn

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

We present a novel unsupervised method for extracting meaningful motifs from biological sequence data. This de novo motif extraction (MEX) algorithm is data driven, finding motifs that are not necessarily over-represented in the data. Applying MEX to the oxidoreductase class of enzymes, which contains approximately 7000 enzyme sequences, yields a relatively small set of motifs. This set spans a motif space that is used for functional classification of the enzymes by an SVM classifier. The classification based on MEX motifs surpasses that of two other SVM-based methods: SVMProt, a method based on the analysis of physical-chemical properties of a protein generated from its sequence of amino acids, and an SVM applied to a Smith-Waterman distance matrix. Our findings demonstrate that the MEX algorithm extracts relevant motifs, supporting a successful sequence-to-function classification.
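
As an illustration of what "spanning a motif space" could mean in practice, the following minimal sketch maps each sequence to a binary vector recording which motifs occur in it. The abstract does not specify the encoding, so the binary presence/absence scheme, the function name `motif_space_vector`, and the motifs and sequences shown are all hypothetical; this is not the MEX algorithm itself, which is what extracts the motifs in the first place.

```python
from typing import List


def motif_space_vector(sequence: str, motifs: List[str]) -> List[int]:
    """Map a sequence to a binary vector over a fixed motif list.

    Component i is 1 if motifs[i] occurs as a contiguous substring
    of the sequence, and 0 otherwise. (Assumed encoding, not taken
    from the paper.)
    """
    return [1 if motif in sequence else 0 for motif in motifs]


# Hypothetical motifs and sequences, for illustration only.
motifs = ["GAGA", "CPHL", "WDE"]
sequences = {
    "seq_a": "MKGAGAVLCPHLT",
    "seq_b": "MWDEKLLGAGAR",
}

# One feature vector per sequence; each vector has len(motifs) components.
vectors = {name: motif_space_vector(seq, motifs) for name, seq in sequences.items()}

for name, vec in vectors.items():
    print(name, vec)
```

Vectors of this kind, paired with known functional labels, are the sort of input a standard SVM implementation could be trained on.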

Original language: English
Title of host publication: Proceedings - 2005 IEEE Computational Systems Bioinformatics Conference, CSB 2005
Pages: 80-85
Number of pages: 6
State: Published - 2005
Event: 2005 IEEE Computational Systems Bioinformatics Conference, CSB 2005 - Stanford, CA, United States
Duration: 8 Aug 2005 to 11 Aug 2005

Publication series

Name: Proceedings - 2005 IEEE Computational Systems Bioinformatics Conference, CSB 2005
Volume: 2005

Conference

Conference: 2005 IEEE Computational Systems Bioinformatics Conference, CSB 2005
Country/Territory: United States
City: Stanford, CA
Period: 8/08/05 to 11/08/05

Keywords

  • Enzyme classification
  • Motif extraction
