Summary of the paper

Title Evaluation of Dictionary Creating Methods for Finno-Ugric Minority Languages
Authors Zsanett Ferenczi, Iván Mittelholcz, Eszter Simon and Tamás Váradi
Abstract In this paper, we present the evaluation of several bilingual dictionary building methods applied to {Komi-Permyak, Komi-Zyrian, Hill Mari, Meadow Mari, Northern Saami, Udmurt}-{English, Finnish, Hungarian, Russian} language pairs. Since these Finno-Ugric minority languages are under-resourced and standard dictionary building methods require a large amount of pre-processed data, we had to find alternative methods. In a thorough evaluation, we compare the results of each method; the comparison confirmed our expectation that the precision of standard lexicon building methods is quite low for under-resourced languages. However, utilizing Wikipedia title pairs extracted via inter-language links and Wiktionary-based methods provided useful results. The newly created word pairs, enriched with several kinds of linguistic information, are to be deployed on the web in the framework of Wiktionary. With our dictionaries, the number of Wiktionary entries in the above-mentioned Finno-Ugric minority languages can be multiplied.
Topics Evaluation Methodologies, Endangered Languages, Lexicon, Lexical Database
Full paper Evaluation of Dictionary Creating Methods for Finno-Ugric Minority Languages
Bibtex @InProceedings{FERENCZI18.312,
  author = {Zsanett Ferenczi and Iván Mittelholcz and Eszter Simon and Tamás Váradi},
  title = "{Evaluation of Dictionary Creating Methods for Finno-Ugric Minority Languages}",
  booktitle = {Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)},
  year = {2018},
  month = {May 7-12, 2018},
  address = {Miyazaki, Japan},
  editor = {Nicoletta Calzolari (Conference chair) and Khalid Choukri and Christopher Cieri and Thierry Declerck and Sara Goggi and Koiti Hasida and Hitoshi Isahara and Bente Maegaard and Joseph Mariani and Hélène Mazo and Asuncion Moreno and Jan Odijk and Stelios Piperidis and Takenobu Tokunaga},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {979-10-95546-00-9},
  language = {english}
}
