LREC 2000 2nd International Conference on Language Resources & Evaluation

Title Corpus Resources and Minority Language Engineering
Authors McEnery Tony (Department of Linguistics, Lancaster University, Bailrigg, Lancaster, LA1 4YT, UK)
Baker Paul (Department of Linguistics, Lancaster University, Bailrigg, Lancaster, LA1 4YT, UK)
Burnard Lou (Oxford University Computing Services, 13 Banbury Road, Oxford, OX2 6NN, UK)
Session Session WO12 - Language Resources: Infrastructural Issues
Full Paper, 187.pdf
Abstract Low-density languages are typically viewed as those for which few language resources are available. Work relating to low-density languages is becoming a focus of increasing attention within language engineering (e.g. Charoenporn, 1997; Hall and Hudson, 1997; Somers, 1997; Nirenberg and Raskin, 1998; Somers, 1998). However, much work related to low-density languages is still in its infancy or, worse, is blocked because the resources needed by language engineers are not available. In response to this situation, the MILLE (Minority Language Engineering) project was established by the Engineering and Physical Sciences Research Council (EPSRC) in the UK to discover what language corpora should be built to enable language engineering work on non-indigenous minority languages in the UK, most of which are typically low-density languages. This paper summarises some of the major findings of the MILLE project.