A Fine-Grained Evaluation Method for Speech-to-Speech Machine Translation Using Concept Annotations


Robert S. Belvin (1), Susanne Riehemann (2), Kristin Precoda (2)

(1) HRL Laboratories LLC; (2) SRI International




In this paper we report on a method of evaluating spoken language translation systems that builds upon a task-based evaluation method developed by CMU. Rather than relying on a predefined database of Interchange Format representations of spoken utterances, our method relies on a set of explicitly defined conventions for creating these interlingual representations. It also departs from CMU's method in its scoring conventions, taking a finer-grained approach to scoring (especially the scoring of predicates). We have attempted to validate this approach to speech-to-speech MT evaluation by looking for a relationship between the scores it generates and the scores generated by a series of experiments using naïve human judgements of the meaning and quality of MT systems' output.


Machine Translation, MT Evaluation, Concept Annotation, Interlingual, Speech-to-speech MT, Human Judgements, Cross-system comparison, Cross-language comparison

Language(s): English
Full Paper