Summary of the paper

Title Language Resource Addition: Dictionary or Corpus?
Authors Shinsuke Mori and Graham Neubig
Abstract In this paper, we investigate the relative effect of two strategies for adding language resources on the word segmentation and part-of-speech tagging problems in Japanese. The first strategy is adding entries to the dictionary, and the second is adding annotated sentences to the training corpus. The experimental results showed that adding annotated sentences to the training corpus is better than adding entries to the dictionary. Adding annotated sentences is especially efficient when new words are added together with the contexts of three real occurrences, as partially annotated sentences. Based on this finding, we annotated invention disclosure texts and observed the resulting word segmentation accuracy.
Topics Morphology, Part-of-Speech Tagging
Full paper Language Resource Addition: Dictionary or Corpus?
