TY - JOUR
T1 - A phonetic vocoder with adaptation to selectable speaker codebooks
AU - Halaly, Israel
AU - Bistritz, Yuval
PY - 2008
Y1 - 2008
N2 - The paper presents a very low bit rate phonetic vocoder based on speech recognition and synthesized speech, with speaker adaptation using a set of speaker phoneme codebooks (SPCBs). The vocoder incorporates a well-designed set of speaker phoneme codebooks that are available to both the encoder and the decoder. The encoder periodically performs 'analysis by synthesis': for each SPCB, the encoder compares the incoming speech with the speech that the decoder could synthesize from the phoneme recognizer's output stream and the quantized pitch data, and adapts it to the incoming speech by spectral warping. The index of the best-performing SPCB and its adaptation parameter are transmitted to the decoder, together with the pitch and recognizer output bit streams, to synthesize speech that better resembles the speaker. In experiments conducted at a bit rate typical of phonetic vocoders (below 300 bps), the incorporated adaptation reduced the average spectral distortion and increased speaker recognizability as judged by listeners. Copyright by EURASIP.
UR - http://www.scopus.com/inward/record.url?scp=84863766453&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:84863766453
SN - 2219-5491
JO - European Signal Processing Conference
JF - European Signal Processing Conference
T2 - 16th European Signal Processing Conference, EUSIPCO 2008
Y2 - 25 August 2008 through 29 August 2008
ER -