Robust feature extraction for automatic recognition of vibrato singing in recorded polyphonic music

Felix Weninger*, Noam Amir, Ofer Amir, Irit Ronen, Florian Eyben, Bjorn Schuller

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

7 Scopus citations

Abstract

We address the robustness of features for fully automatic recognition of vibrato, which is usually defined as a periodic oscillation of the pitch (F0) of the singing voice, in recorded polyphonic music. Using an evaluation database covering jazz, pop and opera music, we show that the extraction of pitch is challenging in the presence of instrumental accompaniment, leading to unsatisfactory classification accuracy (61.1%) if only the F0 frequency spectrum is used as features. To alleviate this, we investigate alternative functionals of F0, alternative low-level features besides F0, and extraction of vocals by monaural source separation. Finally, we propose to use inter-quartile ranges of F0 delta regression coefficients as features, which are highly robust against pitch extraction errors, reaching up to 86.9% accuracy in real-life conditions without any signal enhancement.
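
As an illustration of the proposed feature only, the following minimal sketch computes delta regression coefficients of an F0 contour and takes their inter-quartile range. It is not the authors' implementation: the function names, the half-window of 2 frames, the 100 frames-per-second rate, the synthetic 220 Hz contours, and the simulated octave error are all assumptions made for this example, and the delta formula is the standard regression formula commonly used in speech processing.

```python
import numpy as np


def delta_coefficients(x, half_window=2):
    """Delta regression coefficients of a 1-D contour.

    Uses the common regression formula
        d[t] = sum_k k * (x[t+k] - x[t-k]) / (2 * sum_k k^2),  k = 1..half_window
    with edge padding. half_window=2 is an assumed value, not taken from the paper.
    """
    x = np.asarray(x, dtype=float)
    padded = np.pad(x, half_window, mode="edge")
    denom = 2.0 * sum(k * k for k in range(1, half_window + 1))
    delta = np.zeros_like(x)
    for k in range(1, half_window + 1):
        ahead = padded[half_window + k: half_window + k + len(x)]
        behind = padded[half_window - k: half_window - k + len(x)]
        delta += k * (ahead - behind)
    return delta / denom


def iqr(values):
    """Inter-quartile range: 75th percentile minus 25th percentile."""
    q1, q3 = np.percentile(values, [25, 75])
    return q3 - q1


# Synthetic F0 contours (hypothetical data): 2 s at an assumed 100 frames/s.
frame_rate = 100.0
t = np.arange(0.0, 2.0, 1.0 / frame_rate)
steady = np.full_like(t, 220.0)                          # no vibrato
vibrato = 220.0 + 5.0 * np.sin(2 * np.pi * 6.0 * t)      # 6 Hz oscillation

# Simulate a single pitch extraction error (octave jump) in one frame.
corrupted = vibrato.copy()
corrupted[50] *= 2.0

steady_iqr = iqr(delta_coefficients(steady))
vibrato_iqr = iqr(delta_coefficients(vibrato))
corrupted_iqr = iqr(delta_coefficients(corrupted))

# For comparison: the full range (max - min) of the deltas, which is
# dominated by the outlier frames around the octave error.
vibrato_range = np.ptp(delta_coefficients(vibrato))
corrupted_range = np.ptp(delta_coefficients(corrupted))

print("IQR steady/vibrato/corrupted:", steady_iqr, vibrato_iqr, corrupted_iqr)
print("range vibrato/corrupted:", vibrato_range, corrupted_range)
```

In this toy setting the inter-quartile range separates the oscillating contour from the flat one, and a single octave error shifts it far less than it shifts the full range of the deltas, which is the intuition behind preferring a quartile-based functional when pitch extraction is unreliable.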

Original language: English
Title of host publication: 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Proceedings
Pages: 85-88
Number of pages: 4
State: Published - 2012
Event: 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Kyoto, Japan
Duration: 25 Mar 2012 - 30 Mar 2012

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Conference

Conference: 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012
Country/Territory: Japan
City: Kyoto
Period: 25/03/12 - 30/03/12

Keywords

  • Singing style
  • feature extraction
  • music signal processing
