Abstract
The current paper proposes skew Gaussian mixture models for speaker recognition and an associated algorithm for its training from experimental data. Speaker identification experiments were conducted, in which speakers were modeled using the familiar Gaussian mixture models (GMM), and the new skew-GMM. Each model type was evaluated using two sets of feature vectors, the mel-frequency cepstral coefficients (MFCC), that are widely used in speaker recognition applications, and line spectra frequencies (LSF), that are used in many low bit rate speech coders but were not that successful in speech and speaker recognition. Results showed that the skew-GMM, with LSF, compares favorably with the GMM-MFCC pair (under fair comparison conditions). They indicate that skew-Gaussians are better suited for capturing the relatively highly non-symmetrical shapes of the LSF distribution. Thus the skew-GMM with LSF offers a worthy alternative to the GMM-MFCC pair for speaker recognition.
Original language | English |
---|---|
Pages (from-to) | 5-8 |
Number of pages | 4 |
Journal | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
State | Published - 2011 |
Event | 12th Annual Conference of the International Speech Communication Association, INTERSPEECH 2011 - Florence, Italy Duration: 27 Aug 2011 → 31 Aug 2011 |
Keywords
- Gaussian mixture models
- Line spectral frequencies
- Skew-Gaussians
- Speaker recognition