LREC 2000 2nd International Conference on Language Resources & Evaluation

Title Tools for the Generation of Morphological Entries in Dictionaries
Authors Viks Ülle (Institute of the Estonian Language, Roosikrantsi 6, EE 10119 Tallinn, Estonia)
Keywords Formal Descriptions, Grammatical Data, Linguistic Software, Rule-Based Morphology, Traditional Dictionaries
Session Session WP1 - Lexicon
Abstract The lexicographer's tool introduced in this report is a semiautomatic system for generating the morphological information section of dictionary entries for Estonian words. Estonian has a complex morphology featuring (1) rich inflection and (2) marked and diverse morpheme variation, affecting both stems and formatives. The kernel of the system is a rule-based automatic morphology with a separate program module for each linguistic subsystem, such as syllabification, recognition of part of speech and inflection type, stem variation, and morpheme and allomorph combinatorics. The modules function as rule interpreters that apply formal grammars stored in an editable text format. The system can generate the following: (1) part of speech, (2) inflection type, (3) inflected forms, (4) morphonological marking: degree of quantity and morpheme boundaries (stem + formative, component boundaries in compounds), (5) morphological references for inflected forms that differ considerably from the headword. The system is configurable: the user chooses which inflected forms are generated, the style of morphonological marking, and the criteria for reference selection. Full automation of the system is limited mainly by morphological homonymy.
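As a rough illustration of a module acting as a rule interpreter over an editable text grammar, the following minimal Python sketch may help. It is not the system described in the abstract: the rule format, the form labels (`pl_nom`, `sg_ine`, and so on), and the function names `parse_rules` and `generate` are all hypothetical, and the sketch assumes the genitive stem is already known, whereas the real system derives stem variants through its own modules. It only shows how the set of generated forms and the boundary marking could be left to the user.

```python
# Hypothetical text grammar: one rule per line, "form_label : formative".
# Each formative is attached to the genitive stem (an assumption of this sketch).
RULES_TEXT = """
# form_label : formative
pl_nom : d
sg_ine : s
sg_ela : st
sg_com : ga
"""


def parse_rules(text):
    """Read the editable rule text into a {form_label: formative} mapping."""
    rules = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        label, formative = (part.strip() for part in line.split(":", 1))
        rules[label] = formative
    return rules


def generate(gen_stem, rules, wanted=None, boundary="+"):
    """Build inflected forms from a genitive stem.

    wanted   -- optional set of form labels to generate (user set-up)
    boundary -- string marking the stem/formative boundary ("" for none)
    """
    forms = {}
    for label, formative in rules.items():
        if wanted is not None and label not in wanted:
            continue
        forms[label] = f"{gen_stem}{boundary}{formative}"
    return forms


rules = parse_rules(RULES_TEXT)
print(generate("maja", rules))  # all forms, with boundary marking
print(generate("raamatu", rules, wanted={"pl_nom"}, boundary=""))  # one form, unmarked
```

Keeping the rules in plain text, separate from the interpreter, means a lexicographer could adjust the grammar without touching program code, which is the design property the abstract attributes to the system's modules.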