Resources for Morphology Learning and Evaluation
Mike Maxwell (Linguistic Data Consortium)
Recently, there has been a proliferation of research into the acquisition of morphological grammars, that is, the grammars and lexicons required for computer-based morphological analysis and synthesis. Approaches to acquiring such grammars range from tools that structure data provided by native speakers and linguists to unsupervised machine learning. Despite this flurry of research into morphology learning, a means of comparing results across different approaches is largely lacking. This paper describes a test bench for morphology learning, which would assist designers of morphology learning programs by providing both training and evaluation data, and would allow comparison across programs. The paper is simultaneously a description of the projected form of the test bench and a call for further input.
Morphology, Machine learning, Written corpora, Benchmarking, Evaluation