Developments in the TIGER Annotation Scheme and their Realization in the Corpus
Sabine Brants (Computational Linguistics, Saarland University, Postfach 151150, 66041 Saarbruecken, Germany)
Silvia Hansen (Computational Linguistics, Saarland University, Postfach 151150, 66041 Saarbruecken, Germany)
WP4: Corpus Annotation
This paper presents the annotation of the German TIGER Treebank. First, issues concerning the annotation, representation, and querying of the treebank are discussed. In this context, the annotation tool ANNOTATE, the export and XML formats of the TIGER Treebank, and the TIGER search tool are briefly introduced. Second, the developments in the TIGER annotation scheme and their realization in the corpus are described, focusing on the differences between the underlying NEGRA annotation scheme and the TIGER annotation scheme developed from it. The main differences concern verb subcategorization, coordination, appositions and parentheses, and proper nouns. Third, the annotation scheme is assessed through an evaluation and a discussion of the problems raised by these changes. For this purpose, inter-annotator agreement in the TIGER project has been analyzed with respect to exactly these changes. The analysis shows where annotators have difficulty making decisions. These difficulties are discussed in greater detail on the basis of annotation examples. The paper concludes with some suggestions for improving the TIGER annotation scheme.
Corpus linguistics, German treebank, Annotation, Guidelines, Consistency