Skew Gaussian mixture models for speaker recognition

Avi Matza*, Yuval Bistritz

*Corresponding author for this work

Research output: Contribution to journal › Conference article › peer-review

1 Scopus citation

Abstract

This paper proposes skew Gaussian mixture models (skew-GMM) for speaker recognition, together with an algorithm for training them from experimental data. In speaker identification experiments, speakers were modeled using both the familiar Gaussian mixture model (GMM) and the new skew-GMM. Each model type was evaluated with two sets of feature vectors: mel-frequency cepstral coefficients (MFCC), which are widely used in speaker recognition applications, and line spectral frequencies (LSF), which are used in many low-bit-rate speech coders but have been less successful in speech and speaker recognition. The results show that the skew-GMM with LSF compares favorably with the GMM-MFCC pair under fair comparison conditions. They indicate that skew-Gaussians are better suited to capturing the relatively strong asymmetry of the LSF distribution. The skew-GMM with LSF thus offers a worthy alternative to the GMM-MFCC pair for speaker recognition.
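
To make the idea of a skew-Gaussian mixture concrete, the following is a minimal sketch and not the authors' formulation or training algorithm, neither of which is given in the abstract. It assumes the Azzalini skew-normal density for each feature dimension and treats the dimensions within a component as independent; both choices are assumptions made for illustration. The function names and the `speaker_models` layout are hypothetical.

```python
import math


def skew_normal_pdf(x, loc, scale, shape):
    """Azzalini skew-normal density (assumed form).

    With shape == 0 this reduces to the ordinary Gaussian density.
    """
    z = (x - loc) / scale
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal pdf
    Phi = 0.5 * (1.0 + math.erf(shape * z / math.sqrt(2.0)))  # standard normal cdf at shape*z
    return 2.0 / scale * phi * Phi


def skew_mixture_log_likelihood(frame, weights, locs, scales, shapes):
    """Log-likelihood of one feature vector under a skew-normal mixture.

    Assumption: each component is a product of independent per-dimension
    skew-normals (the analogue of a diagonal-covariance GMM).
    """
    total = 0.0
    for w, loc, scale, shape in zip(weights, locs, scales, shapes):
        comp = w
        for x, l, s, a in zip(frame, loc, scale, shape):
            comp *= skew_normal_pdf(x, l, s, a)
        total += comp
    return math.log(total)


def identify_speaker(frames, speaker_models):
    """Return the speaker whose model gives the highest total log-likelihood.

    speaker_models maps a name to (weights, locs, scales, shapes);
    this layout is hypothetical.
    """
    scores = {
        name: sum(skew_mixture_log_likelihood(f, *params) for f in frames)
        for name, params in speaker_models.items()
    }
    return max(scores, key=scores.get)
```

In this sketch the `shape` parameter controls the asymmetry of each component, which is what would let such a model follow a skewed feature distribution; a practical implementation would also work in the log domain to avoid underflow on long utterances.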

Original language: English
Pages (from-to): 5-8
Number of pages: 4
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
State: Published - 2011
Event: 12th Annual Conference of the International Speech Communication Association, INTERSPEECH 2011 - Florence, Italy
Duration: 27 Aug 2011 to 31 Aug 2011

Keywords

  • Gaussian mixture models
  • Line spectral frequencies
  • Skew-Gaussians
  • Speaker recognition
