The COST 278 MASPER initiative - crosslingual speech recognition with large telephone databases
Andrej Žgank (1), Zdravko Kačič (1), Frank Diehl (2), Klara Vicsi (3), Gyorgy Szaszak (3), Jozef Juhar (4), Slavomir Lihan (4)
(1) University of Maribor, Maribor, Slovenia; (2) Universitat Politecnica de Catalunya, Barcelona, Spain; (3) Budapest University of Technology and Economics, Budapest, Hungary; (4) Technical University of Kosice, Kosice, Slovakia
This paper presents the work on crosslingual speech recognition carried out by the MASPER initiative, which was formed as part of the COST 278 Action. Two approaches for transferring monolingual source acoustic models to a new language were compared. The first was expert-driven, based on the IPA scheme. The second was data-driven, based on a crosslingual phoneme confusion matrix. German, Spanish, Hungarian and Slovak were used as source languages, and Slovenian was selected as the target language. All experiments were carried out on SpeechDat databases. Analysis of the results showed that the expert-driven method outperforms the data-driven one, and that similarities between source and target language have a significant influence on performance.
crosslingual speech recognition, SpeechDat database, IPA scheme, phoneme confusion matrix, language similarities
|Language(s)||German, Hungarian, Slovak, Slovenian, Spanish|