TY - GEN
T1 - The effect of pitch, intensity and pause duration in punctuation detection
AU - Levy, Tal
AU - Silber-Varod, Vered
AU - Moyal, Ami
PY - 2012
Y1 - 2012
N2 - The purpose of this research is to automatically detect punctuation in speech using only prosodic cues. We aim to integrate prosodic elements such as pauses, changes in f0 and amplitude range, into an Automatic Speech Recognition engine in order to generate punctuation for read speech, without taking the context of the sentences into consideration. We trained acoustic models of the prosodic features of two Punctuation Marks (PMs): full-stop and comma, which we assume have distinct prosodic characteristics. A Neural Network was used to estimate the weights assigned to each prosodic feature that corresponds to a particular PM, later to be used by a PM classifier. Results show that 87% of full-stops were detected, with only 14% false alarms. Nevertheless, since most commas are realized with no pitch breaks, only 54% of the commas were detected, with 35% false alarms. Our results support the hypothesis that acoustic-prosodic cues provide useful evidence about phrases.
UR - https://www.scopus.com/pages/publications/84871993382
U2 - 10.1109/EEEI.2012.6376934
DO - 10.1109/EEEI.2012.6376934
M3 - Conference contribution
AN - SCOPUS:84871993382
SN - 9781467346801
T3 - 2012 IEEE 27th Convention of Electrical and Electronics Engineers in Israel, IEEEI 2012
BT - 2012 IEEE 27th Convention of Electrical and Electronics Engineers in Israel, IEEEI 2012
T2 - 2012 IEEE 27th Convention of Electrical and Electronics Engineers in Israel, IEEEI 2012
Y2 - 14 November 2012 through 17 November 2012
ER -