Motif extraction and protein classification

Vered Kunik*, Zach Solan, Shimon Edelman, Eytan Ruppin, David Horn

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

27 Scopus citations


We present a novel unsupervised method for extracting meaningful motifs from biological sequence data. This de novo motif extraction (MEX) algorithm is data driven, finding motifs that are not necessarily over-represented in the data. Applying MEX to the oxidoreductases class of enzymes, which contains approximately 7000 enzyme sequences, we obtain a relatively small set of motifs. This set spans a motif space that is used for functional classification of the enzymes by an SVM classifier. The classification based on MEX motifs surpasses that of two other SVM-based methods: SVMProt, a method based on the analysis of physical-chemical properties of a protein generated from its sequence of amino acids, and an SVM applied to a Smith-Waterman distance matrix. Our findings demonstrate that the MEX algorithm extracts relevant motifs, supporting a successful sequence-to-function classification.
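As a rough illustration of what "spanning a motif space" can mean in practice, the sketch below maps each sequence to a vector with one coordinate per motif. It is not the MEX algorithm and not the authors' feature encoding: the binary presence/absence encoding, the function names `motif_vector` and `embed`, and the toy motifs and sequences are all assumptions made for illustration. The resulting vectors are the kind of input an SVM classifier could be trained on; the classifier itself is left out.

```python
from typing import List, Sequence


def motif_vector(sequence: str, motifs: Sequence[str]) -> List[int]:
    """Binary encoding (an assumption): 1 if the motif occurs in the sequence, else 0."""
    return [1 if motif in sequence else 0 for motif in motifs]


def embed(sequences: Sequence[str], motifs: Sequence[str]) -> List[List[int]]:
    """Map every sequence to its coordinates in the motif space."""
    return [motif_vector(seq, motifs) for seq in sequences]


# Hypothetical toy data, not taken from the paper.
motifs = ["GAGK", "CPH", "WDL"]
sequences = ["MKGAGKTLCPHA", "MWDLSSR", "MAAAA"]

features = embed(sequences, motifs)
for seq, vec in zip(sequences, features):
    print(seq, vec)
```

Each row of `features` has as many entries as there are motifs, so a small motif set yields a compact representation regardless of sequence length.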

Original language: English
Title of host publication: Proceedings - 2005 IEEE Computational Systems Bioinformatics Conference, CSB 2005
Number of pages: 6
State: Published - 2005
Event: 2005 IEEE Computational Systems Bioinformatics Conference, CSB 2005 - Stanford, CA, United States
Duration: 8 Aug 2005 to 11 Aug 2005

Publication series

Name: Proceedings - 2005 IEEE Computational Systems Bioinformatics Conference, CSB 2005


Conference: 2005 IEEE Computational Systems Bioinformatics Conference, CSB 2005
Country/Territory: United States
City: Stanford, CA


  • Enzyme classification
  • Motif extraction

