LREC 2000 2nd International Conference on Language Resources & Evaluation
 


Title What are Transcription Errors and Why are They Made?
Authors Oppermann Daniela (Institute of Phonetics and Speech Communication, Schellingstr. 3, 80799 Munich, Germany, daniela.oppermann@phonetik.uni-muenchen.de)
Burger Susanne (Interactive Systems Laboratories, Carnegie Mellon University, Pittsburgh, USA, University of Karlsruhe, Germany, sburger@cs.cmu.edu)
Weilhammer Karl (Institute of Phonetics and Speech Communication, Schellingstr. 3, 80799 Munich, Germany, karl.weilhammer@phonetik.uni-muenchen.de)
Keywords Annotation Errors, Data-Collection, Spontaneous Speech, Transcription
Session Session SP2 - Spoken Language Resources Issues from Construction to Validation
Full Paper 205.ps, 205.pdf
Abstract In recent work we compared transcriptions of German spontaneous dialogues from the VERBMOBIL corpus to ascertain differences between transcribers and in transcription quality. A better understanding of where inconsistencies occur, and of what kind they are, will help us to improve the working environment for transcribers, reduce the effort spent on correction passes, and ultimately yield better transcription quality. The results show that transcribers differ in how they perceive spontaneous speech phenomena, mainly prosodic phenomena such as pauses in speech and lengthening. During the correction pass, 80% of these labels had to be inserted. Additionally, the annotation of non-grammatical phrases and pronunciation comments seems to need a better explanation in the convention manual: here the correcting transcribers had to change 20% of the annotations.