TY - JOUR
T1 - Speaker indexing in audio archives using Gaussian mixture scoring simulation
AU - Aronowitz, Hagai
AU - Burshtein, David
AU - Amir, Amihood
PY - 2005
Y1 - 2005
N2 - Speaker indexing has recently emerged as an important task due to the rapidly growing volume of audio archives. Current filtration techniques still suffer from problems both in accuracy and efficiency. In this paper an efficient method to simulate GMM scoring is presented. Simulation is done by fitting a GMM not only to every target speaker but also to every test utterance, and then computing the likelihood of the test call using these GMMs instead of using the original data. GMM simulation is used to achieve very efficient speaker indexing in terms of both search time and index size. Results on the SPIDRE and NIST-2004 speaker evaluation corpuses show that our approach maintains and sometimes exceeds the accuracy of the conventional GMM algorithm and achieves efficient indexing capabilities: 6000 times faster than a conventional GMM with 1% overhead in storage.
UR - http://www.scopus.com/inward/record.url?scp=24144462539&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-30568-2_21
DO - 10.1007/978-3-540-30568-2_21
M3 - Conference article
AN - SCOPUS:24144462539
SN - 0302-9743
VL - 3361
SP - 243
EP - 252
JO - Lecture Notes in Computer Science
JF - Lecture Notes in Computer Science
T2 - First International Workshop on Machine Learning for Multimodal Interaction, MLMI 2004
Y2 - 21 June 2004 through 23 June 2004
ER -