Verb Valency Descriptors for a Syntactic Treebank
Bulgarian Academy of Sciences, Institute for Parallel Processing, 25A, Acad. G. Bonchev St, 1113 Sofia, Bulgaria
Verb class descriptors, which provide information about the relations of predicates to their arguments, are an essential component of Language Engineering (LE) tools. The production of computationally tractable language resources requires assigning types of predicate-argument relations to a great variety of verb-centered structures: it is necessary to define not only the initial, canonical valency frame of a large number of verb lexemes, but also their diathesis alternations, which reflect the real-life usage of verbs. This paper describes the implementation of descriptors of the valency properties of Bulgarian verbs used in the production of a syntactic treebank of Bulgarian. The descriptors are based on available LE resources for Bulgarian: a verb subcategorization model implemented in the lexical database used, and a chunk grammar that recognizes verb form patterns. Predictive models are built and applied in a grammar that annotates grammatical relations inferred from a combination of morphosyntactic and shallow syntactic processing cues. The main significance of this processing is that, for many verbs, it resolves the discrepancy or contradiction between the valency properties specified in the verb lexicon and the verb's syntagmatic realization.
verb valency, diathesis alternations, lexicon, shallow syntactic processing