Valency Dictionary of Czech Verbs: Complex Tectogrammatical Annotation
Markéta Straňáková-Lopatková (Center for Computational Linguistics, Faculty of Mathematics and Physics, Charles University, Malostranské nám. 25, CZ-11800 Prague, Czech Republic)
Zdeněk Žabokrtský (Center for Computational Linguistics, Faculty of Mathematics and Physics, Charles University, Malostranské nám. 25, CZ-11800 Prague, Czech Republic)
A lexicon containing a certain kind of syntactic information about verbs is a crucial prerequisite for most tasks in Natural Language Processing. The goal of the project described in this paper is to create a human- and machine-readable lexicon that captures in detail the valency behavior of hundreds of the most frequent Czech verbs. The manual annotation effort invested in this project limits the speed of the lexicon's growth on the one hand, but on the other guarantees significantly higher data consistency than that of automatically acquired lexicons. In this paper, we outline the theoretical background on which the lexicon is based and describe the annotation schema (lexicon data structure, annotation tools, etc.). Selected quantitative characteristics of the lexicon are presented as well.
Czech verbs, Lexicons