Utilization of Multiple Language Resources for Robust Grammar-Based Tense and Aspect Classification


Alexis Palmer, Jonas Kuhn, Carlota Smith

The University of Texas at Austin, Department of Linguistics




This paper reports on an ongoing project that uses varied language resources and advanced NLP tools for a linguistic classification task in discourse semantics. The system we present is designed to assign a "situation entity" class label to each predicator in English text. The project goal is to achieve the best possible identification of situation entities in naturally occurring written texts by implementing a robust system that can deal with real corpus material, rather than just with constructed textbook examples of discourse. In this paper we focus on the combination of multiple information sources, which we see as vital for a robust classification system. We use a deep syntactic grammar of English to identify morphological, syntactic, and discourse clues, and we use various lexical databases for fine-grained semantic properties of the predicators. Experiments performed to date show that enhancing the output of the grammar with information from lexical resources improves recall but lowers precision in the situation entity classification task.


Discourse semantics, Tools for linguistic analysis, Semantic classification, Application of a deep grammar, Lexical resources


