Summary of the paper

Title Universal Dependencies Version 2 for Japanese
Authors Masayuki Asahara, Hiroshi Kanayama, Takaaki Tanaka, Yusuke Miyao, Sumire Uematsu, Shinsuke Mori, Yuji Matsumoto, Mai Omura and Yugo Murawaki
Abstract The Universal Dependencies (UD) project (McDonald et al., 2013) has defined a consistent, crosslinguistic target and syntactic structure representation format. In this presentation, we show the work of the UD Japanese team, which was organised by people who are developing their own treebanks or parsers. We have developed and maintained several datasets for Japanese that are compatible with the UD guidelines (version 2.0). Most of the data are produced by automatic conversion from the existing treebank. The UD annotation guidelines were updated from version 1 to version 2 in early 2017. The automatic conversion enabled us to adapt the existing annotation, which is based on traditional Japanese grammar conventions, to the changes in the UD annotation guidelines. In this paper, we discuss the issues with the UD Japanese resources to date. These issues stem from the difficulty of performing cross-linguistically consistent annotation for a language whose grammatical system differs from those of Western European languages. The issues related to the conversion fall into four groups: delimitation (word, phrase and clause), undefined policies in the UD guidelines, typological systems for UD, and copyright of Japanese language resources.
Topics Part-Of-Speech Tagging, Corpus (Creation, Annotation, Etc.), Other
Full paper Universal Dependencies Version 2 for Japanese
Bibtex @InProceedings{ASAHARA18.276,
  author = {Masayuki Asahara and Hiroshi Kanayama and Takaaki Tanaka and Yusuke Miyao and Sumire Uematsu and Shinsuke Mori and Yuji Matsumoto and Mai Omura and Yugo Murawaki},
  title = "{Universal Dependencies Version 2 for Japanese}",
  booktitle = {Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)},
  year = {2018},
  month = {May 7-12, 2018},
  address = {Miyazaki, Japan},
  editor = {Nicoletta Calzolari (Conference chair) and Khalid Choukri and Christopher Cieri and Thierry Declerck and Sara Goggi and Koiti Hasida and Hitoshi Isahara and Bente Maegaard and Joseph Mariani and Hélène Mazo and Asuncion Moreno and Jan Odijk and Stelios Piperidis and Takenobu Tokunaga},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {979-10-95546-00-9},
  language = {english}
}