Evaluation of Pronunciation Variants in the ASR Lexicon for Different Speaking Styles
Ingunn Amdal (Department of Telecommunications, Norwegian University of Science and Technology, N-7491 Trondheim, Norway)
Torbjørn Svendsen (Department of Telecommunications, Norwegian University of Science and Technology, N-7491 Trondheim, Norway)
SO6: Phonetic Lexicons
One of the challenges in automatic speech recognition (ASR) is how to handle pronunciation variation. The main causes of pronunciation variation are the speaker (voice characteristics, accent, non-nativeness, etc.) and the speaking style (reading, spontaneous responses, conversation, etc.). An ASR system has essentially two options for modelling this variation at the word and sub-word level: lexical modelling of the pronunciation variation, or adaptation (i.e. re-training) of the acoustic models. Which technique to choose, or how to combine them, may depend on the speaking style. We have therefore investigated the effect of using pronunciation variants in the recognition of read speech, spontaneous dictation, and non-native speech. The variants in the standard-purpose lexicon that we tested gave modest improvements, with the best results for read speech, which is the speaking style of the acoustic model training set.
Speech recognition, Lexical modelling, Pronunciation variation, Non-native speech, Spontaneous speech
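
As an illustration of what lexical modelling of pronunciation variation means in practice, the following minimal Python sketch stores several pronunciations per word and derives a single-pronunciation baseline from the same data. It is not the lexicon or tooling used in this work: the words, the phone symbols, and the function names (build_lexicon, restrict_to_canonical, variants_per_word) are all hypothetical, and the first pronunciation listed for a word is simply assumed to be the canonical one.

```python
from collections import defaultdict

# Hypothetical toy entries: (word, space-separated phone string).
# The phone symbols are illustrative and not taken from any real lexicon.
ENTRIES = [
    ("and", "ae n d"),
    ("and", "ax n"),
    ("data", "d ey t ax"),
    ("data", "d ae t ax"),
    ("speech", "s p iy ch"),
]


def build_lexicon(entries):
    """Map each word to its distinct pronunciations, in first-seen order."""
    lexicon = defaultdict(list)
    for word, phones in entries:
        pron = tuple(phones.split())
        if pron not in lexicon[word]:  # skip exact duplicates
            lexicon[word].append(pron)
    return dict(lexicon)


def restrict_to_canonical(lexicon):
    """Keep only the first pronunciation of each word (assumed canonical)."""
    return {word: prons[:1] for word, prons in lexicon.items()}


def variants_per_word(lexicon):
    """Average number of pronunciations per word in the lexicon."""
    return sum(len(prons) for prons in lexicon.values()) / len(lexicon)


multi = build_lexicon(ENTRIES)
single = restrict_to_canonical(multi)
```

Comparing recognition results obtained with a lexicon like `multi` against those obtained with `single`, using the same acoustic models, is one way to isolate the contribution of the pronunciation variants for a given speaking style.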