Summary of the paper

Title New Inflectional Lexicons and Training Corpora for Improved Morphosyntactic Annotation of Croatian and Serbian
Authors Nikola Ljubešić, Filip Klubička, Željko Agić and Ivo-Pavao Jazbec
Abstract In this paper we present newly developed inflectional lexcions and manually annotated corpora of Croatian and Serbian. We introduce hrLex and srLex - two freely available inflectional lexicons of Croatian and Serbian - and describe the process of building these lexicons, supported by supervised machine learning techniques for lemma and paradigm prediction. Furthermore, we introduce hr500k, a manually annotated corpus of Croatian, 500 thousand tokens in size. We showcase the three newly developed resources on the task of morphosyntactic annotation of both languages by using a recently developed CRF tagger. We achieve best results yet reported on the task for both languages, beating the HunPos baseline trained on the same datasets by a wide margin.
Topics Lexicon, Lexical Database, Part-of-Speech Tagging, Collaborative Resource Construction
Full paper New Inflectional Lexicons and Training Corpora for Improved Morphosyntactic Annotation of Croatian and Serbian
Bibtex @InProceedings{LJUBEI16.340,
  author = {Nikola Ljubešić and Filip Klubička and Željko Agić and Ivo-Pavao Jazbec},
  title = {New Inflectional Lexicons and Training Corpora for Improved Morphosyntactic Annotation of Croatian and Serbian},
  booktitle = {Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)},
  year = {2016},
  month = {may},
  date = {23-28},
  location = {Portoro┼ż, Slovenia},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Sara Goggi and Marko Grobelnik and Bente Maegaard and Joseph Mariani and Helene Mazo and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  address = {Paris, France},
  isbn = {978-2-9517408-9-1},
  language = {english}
 }
Powered by ELDA © 2016 ELDA/ELRA