Title Bring vs. MTRoget: Evaluating Automatic Thesaurus Translation
Authors Lars Borin, Jens Allwood and Gerard De Melo
Abstract Evaluation of automatic language-independent methods for language technology resource creation is difficult, and confounded by a largely unknown quantity, viz. to what extent typological differences among languages are significant for results achieved for one language or language pair to be applicable across languages generally. In the work presented here, as a simplifying assumption, language-independence is taken as axiomatic within certain specified bounds. We evaluate the automatic translation of Roget's "Thesaurus" from English into Swedish using an independently compiled Roget-style Swedish thesaurus, S.C. Bring's "Swedish vocabulary arranged into conceptual classes" (1930). Our expectation is that this explicit evaluation of one of the thesauruses created in the MTRoget project will provide a good estimate of the quality of the other thesauruses created using similar methods.
Topics Multilinguality, Statistical and Machine Learning Methods
