Definition, dictionaries and tagger for Extended Named Entity Hierarchy


Satoshi Sekine (1), Chikashi Nobata (2)

(1) New York University; (2) Communications Research Laboratory




The tagging of Named Entities (NEs), the names of particular things or classes, is regarded as an important component technology for many NLP applications. The first Named Entity set had 7 types: organization, location, person, date, time, money, and percent expressions. Later, the IREX project added artifact, and ACE added two more, GPE and facility, to pursue the generalization of the technology. However, 7 or 8 kinds of NE are not broad enough to cover general applications. We previously proposed about 150 NE categories (Sekine et al. 2002), and we have now extended the set to 200 categories. We have also developed dictionaries and an automatic tagger for NEs in Japanese.


Keywords: Named Entity, dictionary, tagger

Language(s): Japanese, English
