A real-time, isolated-utterance speech recognizer for natural language with a 5,000-word vocabulary has been designed. It is based on the IBM Personal Computer AT model and two IBM signal processors realized in VLSI technology. The enrollment period for a new user is approximately 20 minutes. The basic vocabulary is chosen from the most common words in several collections of documents such as office memoranda and business letters. The system supports spelling and interactive personalization to augment this vocabulary. Signal processing, vector quantization, and acoustic matching algorithms are programmed on the signal processors, which fit into the PC AT chassis. The PC AT controls the processors and implements the decoder stack search and the language model, as well as the application-specific interface. The modular architecture of the design is expandable to a 20,000-word vocabulary system by the addition of two more signal processors housed in a PC expansion unit.
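As a rough illustration of the vector quantization stage that the abstract assigns to the signal processors, the sketch below labels each acoustic feature frame with the index of its nearest codebook vector. The abstract does not give the distance measure, codebook size, or feature dimensionality, so the squared Euclidean distance, the function name `quantize`, and the toy two-dimensional data are assumptions made only for illustration. This is not the system's actual implementation.

```python
from typing import List, Sequence


def quantize(
    frames: Sequence[Sequence[float]],
    codebook: Sequence[Sequence[float]],
) -> List[int]:
    """Label each frame with the index of its nearest codebook vector.

    Assumption: squared Euclidean distance is used as the distortion
    measure; the original system may have used a different one.
    """
    labels = []
    for frame in frames:
        # Pick the codebook index with the smallest distance to this frame.
        best_index = min(
            range(len(codebook)),
            key=lambda k: sum((x - c) ** 2 for x, c in zip(frame, codebook[k])),
        )
        labels.append(best_index)
    return labels


# Hypothetical toy data: three prototype vectors and three feature frames.
toy_codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
toy_frames = [[0.1, -0.2], [0.9, 1.2], [3.5, 4.1]]
toy_labels = quantize(toy_frames, toy_codebook)
```

In a pipeline of the kind described, the resulting label sequence would be passed on to the acoustic matching stage, while the stack search and language model run on the host.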
|Number of pages|4|
|Journal|Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing|
|State|Published - 1986|