A Machine Learning Approach to Automatic Functor Assignment in the Prague Dependency Treebank
Zdenek Zabokrtsky (Center for Computational Linguistics, Charles University, Prague, Czech Republic)
Petr Sgall (Center for Computational Linguistics, Charles University, Prague, Czech Republic)
Saso Dzeroski (Jozef Stefan Institute, Ljubljana, Slovenia)
WO18: Syntactic Annotation
This paper describes and evaluates a system that automates part of the transition from analytical to tectogrammatical tree structures in the Prague Dependency Treebank: it assigns functors to autosemantic words. The system is based on decision tree induction, a machine learning approach. The resulting software tool is incorporated into the annotation process, where it significantly reduces the manual effort of this transition, a step that otherwise consumes a large amount of linguistic experts' time.
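To make the idea of functor assignment by decision tree induction concrete, the following is a minimal sketch in Python. It is not the system evaluated in the paper: the two features (`afun`, `case`), the six training examples, and the helper names `induce` and `assign` are all hypothetical, and the induction procedure is a plain ID3-style entropy split chosen only for brevity.

```python
import math
from collections import Counter

# Hypothetical toy data: features of an analytical-tree node -> functor.
# Both the feature set and the examples are illustrative assumptions,
# not data or attributes taken from the paper.
TRAIN = [
    ({"afun": "Sb", "case": "nom"}, "ACT"),
    ({"afun": "Sb", "case": "nom"}, "ACT"),
    ({"afun": "Obj", "case": "acc"}, "PAT"),
    ({"afun": "Obj", "case": "dat"}, "ADDR"),
    ({"afun": "Adv", "case": "loc"}, "LOC"),
    ({"afun": "Adv", "case": "acc"}, "TWHEN"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of functor labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def induce(rows, features):
    """ID3-style induction: split on the feature that leaves the least entropy."""
    labels = [label for _, label in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority  # leaf: a single functor

    def remainder(feature):
        # Weighted entropy of the labels after splitting on `feature`.
        groups = {}
        for node, label in rows:
            groups.setdefault(node[feature], []).append(label)
        return sum(len(g) / len(rows) * entropy(g) for g in groups.values())

    best = min(features, key=remainder)
    rest = [f for f in features if f != best]
    branches = {}
    for node, label in rows:
        branches.setdefault(node[best], []).append((node, label))
    return {
        "feature": best,
        "default": majority,  # fallback for feature values unseen in training
        "children": {value: induce(sub, rest) for value, sub in branches.items()},
    }


def assign(tree, node):
    """Walk the induced tree and return the functor predicted for `node`."""
    while isinstance(tree, dict):
        tree = tree["children"].get(node.get(tree["feature"]), tree["default"])
    return tree


TREE = induce(TRAIN, ["afun", "case"])
print(assign(TREE, {"afun": "Adv", "case": "acc"}))
```

In an annotation workflow of the kind described above, such a classifier would only pre-fill functors, leaving the annotator to confirm or correct them.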