LREC 2000 2nd International Conference on Language Resources & Evaluation
 


Title A Word-level Morphosyntactic Analyzer for Basque
Authors Aduriz I. (Universidad de Barcelona, Gran Vía de las Cortes Catalanas, 585, E-08007 Barcelona)
Agirre E. (Dept. of Computer Languages and Systems, University of the Basque Country, 649 P. K., E-20080 Donostia, Basque Country)
Aldezabal I. (Dept. of Computer Languages and Systems, University of the Basque Country, 649 P. K., E-20080 Donostia, Basque Country)
Arregi X. (Dept. of Computer Languages and Systems, University of the Basque Country, 649 P. K., E-20080 Donostia, Basque Country)
Arriola J. M. (Dept. of Computer Languages and Systems, University of the Basque Country, 649 P. K., E-20080 Donostia, Basque Country)
Artola X. (Dept. of Computer Languages and Systems, University of the Basque Country, 649 P. K., E-20080 Donostia, Basque Country)
Gojenola K. (Dept. of Computer Languages and Systems, University of the Basque Country, 649 P. K., E-20080 Donostia, Basque Country)
Maritxalar A. (Dept. of Computer Languages and Systems, University of the Basque Country, 649 P. K., E-20080 Donostia, Basque Country)
Sarasola K. (Dept. of Computer Languages and Systems, University of the Basque Country, 649 P. K., E-20080 Donostia, Basque Country)
Urkia M. (UZEI, Aldapeta 20, E-20009 Donostia, Basque Country, jipgogak@si.ehu.es)
Keywords Agglutinative Languages, Morphology, Morphosyntax
Session Session WP2 - Corpus Annotation
Full Paper 44.ps, 44.pdf
Abstract This work presents the development and implementation of a full morphological analyzer for Basque, an agglutinative language. Several problems (phrase structure inside word-forms, noun ellipsis, multiplicity of values for the same feature and the use of complex linguistic representations) have forced us to go beyond the morphological segmentation of words, and to include an extra module that performs a full morphosyntactic parsing of each word-form. A unification-based word-level grammar has been defined for that purpose. The system has been integrated into a general environment for the automatic processing of corpora, using TEI-conformant SGML feature structures.
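The unification step at the heart of such a word-level grammar can be illustrated with a minimal sketch. It is not the authors' implementation: the nested-dictionary representation, the `unify` function, and the feature names (`lemma`, `cat`, `agr`, `case`, `num`) are all assumptions chosen for illustration, and the segmentation of the Basque word-form "etxean" ("in the house") into a stem and a single suffix is simplified.

```python
def unify(a, b):
    """Unify two feature structures (nested dicts with atomic leaves).

    Returns the merged structure, or None when two values clash.
    """
    result = dict(a)
    for key, b_val in b.items():
        if key not in result:
            # Feature only present in b: copy it over.
            result[key] = b_val
            continue
        a_val = result[key]
        if isinstance(a_val, dict) and isinstance(b_val, dict):
            # Both values are complex: unify them recursively.
            merged = unify(a_val, b_val)
            if merged is None:
                return None
            result[key] = merged
        elif a_val != b_val:
            # Conflicting values: unification fails.
            return None
    return result


# Hypothetical morpheme entries; the feature names are illustrative only.
stem = {"lemma": "etxe", "cat": "noun"}
suffix = {"cat": "noun", "agr": {"case": "inessive", "num": "sg"}}
clash = {"cat": "verb"}

word = unify(stem, suffix)   # compatible: features are merged
failed = unify(stem, clash)  # incompatible "cat" values: None
print(word)
print(failed)
```

In this sketch, combining the stem with a compatible suffix yields one merged structure for the whole word-form, while a category clash makes the analysis fail. A real grammar would additionally need to handle the cases the abstract lists, such as several values for the same feature.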