Acoustic Modeling and Training of a Bilingual ASR System when a Minority Language is Involved
Laura Docio-Fernandez (Departamento de Teoria de la Señal y Comunicaciones E.T.S.I. Telecomunicacion Campus Universitario de Vigo 36200 VIGO, SPAIN)
Carmen Garcia-Mateo (Departamento de Teoria de la Señal y Comunicaciones E.T.S.I. Telecomunicacion Campus Universitario de Vigo 36200 VIGO, SPAIN)
SP2: Speech Varieties And Multilingual ASR
This paper describes our work in developing a bilingual speech recognition system using two SpeechDat databases. The bilingual aspect of this work is of particular importance in the Galician region of Spain, where Galician and Spanish coexist and one of them, Galician, is a minority language. Based on a global Spanish-Galician phoneme set, we built a bilingual speech recognition system that can handle both languages. The recognizer uses context-dependent acoustic models based on continuous-density hidden Markov models. The system was evaluated on an isolated-word, large-vocabulary task. The tests show that the Spanish system performs better than the Galician system because of its better training. The bilingual system achieves performance equivalent to that of the language-specific systems.
Minority languages, Multilingual ASR systems