Diagnostic Assessment of Telephone Transmission Impact on ASR Performance and Human-to-Human Speech Quality
Sebastian Möller (Inst. of Communication Acoustics, Ruhr-University Bochum, D-44780 Bochum, Germany)
Ergina Kavallieratou (Wire Communications Lab, University of Patras, GR-26500 Patras, Greece)
SO5: Speech Variabilities & Multilingual ASR
This paper addresses the impact of the transmission channel on human-to-human speech communication quality and on ASR performance. The channels considered include standard wireline and mobile telephone networks as well as IP-based networks, which can be operated via different types of user interfaces. To gain control over the transmission channel, a simulation model is developed that implements all types of stationary impairments found in these networks. Human-to-human speech communication quality under these conditions is estimated using a network planning model. ASR performance over the same channel is assessed experimentally with three recognizers: two prototypical recognizers used in a telephone-based information server, and a standardized set-up developed under the AURORA framework for distributed ASR. The results reveal notable differences between the behavior of ASR performance and that of speech quality in human-to-human communication. These differences should be taken into account by both ASR system developers and transmission network planners.
Telephone speech, ASR performance, Diagnostic evaluation, Quality prediction, Simulation tools