Title Information structure in the Potsdam Commentary Corpus: Topics
Authors Manfred Stede and Sara Mamprin
Abstract The Potsdam Commentary Corpus is a collection of 175 German newspaper commentaries annotated on a variety of layers. This paper introduces a new layer that covers the linguistic notion of information-structural topic (not to be confused with 'topic' as applied to documents in information retrieval). To our knowledge, this is the first larger topic-annotated resource for German (and one of the first for any language). We describe the annotation guidelines, the annotation process, and the results of an inter-annotator agreement study, which compare favourably with related work. The annotated corpus is freely available for research.
Topics Corpus (Creation, Annotation, etc.), Discourse Annotation, Representation and Processing, Anaphora, Coreference
