Title The Penn Discourse Treebank
Author(s) Eleni Miltsakaki (1), Rashmi Prasad (1), Aravind Joshi (1), Bonnie Webber (2)

(1) University of Pennsylvania; (2) University of Edinburgh

Session O47-W
Abstract This paper describes a new discourse-level annotation project -- the Penn Discourse Treebank (PDTB) -- that aims to produce a large-scale corpus in which discourse connectives are annotated along with their arguments, exposing a clearly defined level of discourse structure. The PDTB is being built directly on top of the Penn Treebank and PropBank, which supports the extraction of useful syntactic and semantic features and provides a richer substrate for developing and evaluating practical algorithms. We present a preliminary analysis of inter-annotator agreement, covering both the level of agreement and the types of inter-annotator variation.
Keyword(s) Corpus annotation, discourse connectives, Penn Discourse Treebank
Language(s) English
Full Paper 618.pdf