Preliminary Evaluation of Slovenian Mobile Database PoliDat
Andrej Zgank (University of Maribor, Faculty of EE & CS)
Zdravko Kacic (University of Maribor, Faculty of EE & CS)
Bogomir Horvat (University of Maribor, Faculty of EE & CS)
This paper describes a preliminary speech recognition evaluation of the PoliDat database. This new database contains Slovenian speech captured over mobile telephones. The design of the database follows the SpeechDat(II) specifications. The recording of the speech material and the format of the database are briefly described. The speech recognition experiment is based on a slightly modified COST 249 refrec0.96 script. Acoustic HMM speech models are trained on the fixed-telephone Slovenian 1000 FDB SpeechDat(II) database. Forty speakers were taken from the mobile PoliDat database: 20 for the test set and 20 for the adaptation set. First, the signal-to-noise ratio of all recordings was calculated; then speech recognition with the unadapted acoustic models was performed. In the next step, retraining of the acoustic models and the maximum likelihood linear regression procedure were used for adaptation. In the last step, the adapted acoustic models were used for speech recognition on the PoliDat database. The adaptation procedures significantly improved recognition of mobile speech with acoustic models trained on fixed-telephone speech. The overall word error rate decreased from 46.5% for the unadapted models to 19.1% and 5.2% for the adapted models.
Mobile database, Database evaluation, Acoustic adaptation
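To put the reported word error rates in perspective, the relative error reduction can be computed directly from the three figures given above. The following is only an illustrative sketch: the function name relative_wer_reduction is hypothetical and not part of the refrec script, and the two adapted results are listed in the order reported, without assuming which adaptation procedure produced which figure.

```python
def relative_wer_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Return the relative WER reduction in percent.

    Both arguments are word error rates in percent. The helper name is
    an assumption made for this illustration only.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (baseline_wer - adapted_wer) / baseline_wer


# Overall word error rates reported in the abstract (percent).
UNADAPTED_WER = 46.5
ADAPTED_WERS = [19.1, 5.2]  # order as reported; method mapping not assumed

if __name__ == "__main__":
    for wer in ADAPTED_WERS:
        reduction = relative_wer_reduction(UNADAPTED_WER, wer)
        print(f"WER {UNADAPTED_WER}% -> {wer}%: "
              f"relative reduction {reduction:.1f}%")
```

Under these figures the two adapted systems correspond to relative reductions of roughly 59% and 89% with respect to the unadapted baseline.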