TY - GEN
T1 - Speech compression using wavelet packet and vector quantizer with 8-msec delay
AU - Averbuch, Amir
AU - Bobrovsky, B.
AU - Sheinin, V.
PY - 1995
Y1 - 1995
AB - We present an algorithm for speech compression which uses the wavelet packet transform, vector quantization, entropy coding and postfiltering of the decoded speech. We address the following issue: obtaining the best speech quality for a given bit rate with minimal algorithmic delay (that is, applying the algorithm to the shortest possible segment). The wavelet packet transform provides good compression because it is closely related to the actual physical processes in the human ear. The experimental results demonstrate that we can compress speech by a factor of 6 to 10 and still have reasonable intelligibility and perceivability of the output speech, using an algorithmic delay of 8 msec (64 speech samples). In addition, the proposed algorithm fits DSP architectures well and can easily be ported to any current 40 MIPS DSP. Comparing the proposed algorithm with a new CELP-oriented algorithm, one can conclude that the former has less delay and a higher compression ratio. Postfiltering was found to improve the quality of the decoded speech. Using fixed-size segments of 64 samples with wrap-around at the segment borders does not degrade performance in comparison with an FIR implementation without wrap-around. In addition, it is useful to use a different filter at each level of the decomposition.
UR - http://www.scopus.com/inward/record.url?scp=0029482968&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:0029482968
SN - 0819419281
SN - 9780819419286
T3 - Proceedings of SPIE - The International Society for Optical Engineering
SP - 320
EP - 332
BT - Proceedings of SPIE - The International Society for Optical Engineering
A2 - Laine, Andrew F.
A2 - Unser, Michael A.
A2 - Wickerhauser, Mladen V.
T2 - Wavelet Applications in Signal and Image Processing III. Part 1 (of 2)
Y2 - 12 July 1995 through 14 July 1995
ER -