Speaker indexing in audio archives using Gaussian mixture scoring simulation

Hagai Aronowitz*, David Burshtein, Amihood Amir

*Corresponding author for this work

Research output: Contribution to journal › Conference article › peer-review

5 Scopus citations

Abstract

Speaker indexing has recently emerged as an important task due to the rapidly growing volume of audio archives. Current filtration techniques still suffer from problems in both accuracy and efficiency. In this paper, an efficient method to simulate GMM scoring is presented. Simulation is done by fitting a GMM not only to every target speaker but also to every test utterance, and then computing the likelihood of the test utterance using these GMMs instead of the original data. GMM simulation is used to achieve very efficient speaker indexing in terms of both search time and index size. Results on the SPIDRE and NIST-2004 speaker evaluation corpora show that our approach maintains, and sometimes exceeds, the accuracy of the conventional GMM algorithm while achieving efficient indexing: it is 6000 times faster than a conventional GMM, with 1% overhead in storage.
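The idea of scoring a test utterance from its fitted GMM rather than from its raw frames can be illustrated with a minimal sketch. It is not taken from the paper: it assumes diagonal covariances, approximates the average per-frame log-likelihood with the closed-form expected log-density between Gaussians, and keeps only the best-matching target component for each test component. The names `expected_log_gauss` and `simulated_gmm_score`, and the list-of-tuples GMM representation, are hypothetical.

```python
import math


def expected_log_gauss(mean_p, var_p, mean_q, var_q):
    """Closed form of E_{x ~ N(mean_p, diag(var_p))}[log N(x; mean_q, diag(var_q))]."""
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        total += math.log(2.0 * math.pi * vq) + (vp + (mp - mq) ** 2) / vq
    return -0.5 * total


def simulated_gmm_score(test_gmm, target_gmm):
    """Approximate the average per-frame log-likelihood of a test utterance
    under target_gmm, using only the parameters of the GMM fitted to the
    test utterance (assumed approximation, not the paper's exact method).

    Each GMM is a list of (weight, mean, variance) tuples, where mean and
    variance are equal-length lists (diagonal covariance).
    """
    score = 0.0
    for w_t, mean_t, var_t in test_gmm:
        # Keep only the best-matching target component (max approximation).
        best = max(
            math.log(w_s) + expected_log_gauss(mean_t, var_t, mean_s, var_s)
            for w_s, mean_s, var_s in target_gmm
        )
        score += w_t * best
    return score


# Toy example: one test GMM scored against a matching and a distant speaker model.
test_gmm = [(1.0, [0.0, 0.0], [1.0, 1.0])]
near_speaker = [(1.0, [0.0, 0.0], [1.0, 1.0])]
far_speaker = [(1.0, [3.0, 3.0], [1.0, 1.0])]

near_score = simulated_gmm_score(test_gmm, near_speaker)
far_score = simulated_gmm_score(test_gmm, far_speaker)
```

In this sketch the cost of one score depends on the number of mixture components rather than on the number of audio frames, which is the kind of saving that makes scoring from stored model parameters attractive for indexing.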

Original language: English
Pages (from-to): 243-252
Number of pages: 10
Journal: Lecture Notes in Computer Science
Volume: 3361
State: Published - 2005
Event: First International Workshop on Machine Learning for Multimodal Interaction, MLMI 2004 - Martigny, Switzerland
Duration: 21 Jun 2004 to 23 Jun 2004
