Related Word-pairs Extraction without Dictionaries
Eiko Yamamoto (1), Kyoji Umemura (2)
(1) National Institute of Information and Communications Technology, 3-5 Hikari-dai, Seika-cho, Souraku-gun, Kyoto, 619-0289 Japan, email@example.com; (2) Toyohashi University of Technology, 1-1 Tempaku, Toyohashi, Aichi, 441-8580 Japan, firstname.lastname@example.org
Although pairs of related words are useful lexical semantic resources, they can be expensive to create and maintain. We propose a method that extracts pairs of related Japanese words from a text corpus without using language knowledge, such as a dictionary, at any step. This is difficult for Japanese text because words are not separated by spaces. The extracted pairs are related words with similar usages, and they can help in understanding texts that contain unknown words. Pairs are extracted by judging whether two words are used in a similar way. We report the precision of the pair lists extracted from various kinds of corpora and analyze the tendencies of each list.
word-pair extraction, word extraction
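The abstract does not say how "used in a similar way" is judged, so the following is only a minimal illustrative sketch, not the authors' procedure. It assumes that candidate strings are fixed-length character n-grams (so no dictionary or word segmenter is needed) and that similarity of usage is approximated by the cosine between counts of the characters immediately to the left and right of each candidate. The function names, the n-gram length, the threshold, and the minimum count are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(corpus, n=2):
    """Map each character n-gram to counts of its adjacent characters.

    Assumption: candidates are raw character n-grams, so no dictionary
    or word segmentation is involved at any step.
    """
    vectors = defaultdict(Counter)
    for line in corpus:
        for i in range(len(line) - n + 1):
            gram = line[i:i + n]
            if i > 0:
                vectors[gram]["L:" + line[i - 1]] += 1  # left neighbour
            if i + n < len(line):
                vectors[gram]["R:" + line[i + n]] += 1  # right neighbour
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def related_pairs(corpus, n=2, threshold=0.8, min_count=2):
    """Return (gram1, gram2, score) for n-grams with similar contexts.

    threshold and min_count are hypothetical tuning parameters.
    """
    vectors = context_vectors(corpus, n)
    grams = sorted(g for g, v in vectors.items()
                   if sum(v.values()) >= min_count)
    pairs = []
    for i, g1 in enumerate(grams):
        for g2 in grams[i + 1:]:
            score = cosine(vectors[g1], vectors[g2])
            if score >= threshold:
                pairs.append((g1, g2, score))
    return sorted(pairs, key=lambda p: -p[2])
```

For example, given the two unsegmented sentences 私は東京に行く and 私は京都に行く, the strings 東京 and 京都 share the same left and right neighbours, so a sketch like this would pair them. On a real corpus the all-pairs comparison is quadratic in the number of candidates, and most n-grams are not words, so a practical system would need a word-extraction step and a more careful similarity judgment than this sketch provides.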