Title NameNet: A Self-Improving Resource for Name Classification
Author(s) Paul Morarescu, Sanda Harabagiu

Human Language Technology Research Institute, Department of Computer Science, University of Texas at Dallas

Session O15-W
Abstract This paper presents a semantically structured resource of more than 1,600 Name Classes. The structure is based on the noun hypernymy hierarchies in WordNet, expanded and validated by corpus evidence collected from the World Wide Web. The set of seed examples provided by WordNet is bootstrapped and then used to automatically construct an annotated training corpus for each Name Class. The resulting Named Entity resource enables a supervised Named Entity Recognizer to identify all the encoded Name Classes with high accuracy and without any human intervention.
Keyword(s) Named Entity Recognition, Information Extraction
Language(s) English
Full Paper 693.pdf