Incremental Recognition and Referential Categorization of French Proper Names


Nordine Fourour (IRIN/University of Nantes, France)

Emmanuel Morin (IRIN/University of Nantes, France)

Beatrice Daille (IRIN/University of Nantes, France)


WP3: Tools & Components


This paper presents Nemesis, a French proper name (PN) recognizer for large-scale Information Extraction (IE), whose specifications were elaborated through corpus investigation in terms of both referential categories and graphical structures. The graphical criteria are used to identify proper names, and the referential classification is used to categorize them. The system is a classical one: it is rule-based and uses specialized lexicons without any linguistic preprocessing. Its originality lies in a modular architecture that includes a learning process. The system currently recognizes anthroponyms and toponyms, achieving 95% precision and 90% recall.
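As a rough illustration of the approach summarized above, the following Python sketch detects candidates with a graphical criterion (runs of capitalized words), categorizes them with small lexicons, and stores categorized names so that later occurrences are recognized without context. It is only a minimal sketch under stated assumptions, not the Nemesis implementation: the lexicons (FIRST_NAMES, PLACE_TRIGGERS), the regular expression, the rule order, and the function name recognize are all hypothetical.

```python
import re

# Hypothetical mini-lexicons (assumptions for illustration only; the actual
# Nemesis lexicons and rules are not given in this abstract).
FIRST_NAMES = {"Jacques", "Marie", "Pierre"}
PLACE_TRIGGERS = {"rue", "ville", "mont", "lac"}

# Graphical criterion (assumed): a run of one or more capitalized words.
CAPITALIZED_RUN = re.compile(
    r"\b[A-ZÀ-ÖØ-Ý][\w'-]+(?:\s+[A-ZÀ-ÖØ-Ý][\w'-]+)*"
)


def recognize(text, learned=None):
    """Return (candidate, category) pairs found in text.

    `learned` maps already categorized names to their category and is
    updated in place, which gives a simple form of incremental learning.
    """
    learned = {} if learned is None else learned
    results = []
    for match in CAPITALIZED_RUN.finditer(text):
        candidate = match.group()
        words = candidate.split()
        preceding = text[:match.start()].split()
        previous_word = preceding[-1].lower() if preceding else ""

        if candidate in learned:
            # Name categorized earlier: reuse the stored category.
            category = learned[candidate]
        elif words[0] in FIRST_NAMES:
            # Known first name at the head of the run.
            category = "anthroponym"
        elif previous_word in PLACE_TRIGGERS:
            # Trigger word such as "lac" immediately before the run.
            category = "toponym"
        else:
            category = "unknown"

        if category != "unknown":
            learned[candidate] = category
        results.append((candidate, category))
    return results


learned_names = {}
first_pass = recognize("hier, Marie Dupont a visité le lac Léman", learned_names)
second_pass = recognize("on dit que Léman est profond", learned_names)
```

In this sketch the second call categorizes "Léman" only because the first call stored it; with an empty memory the same sentence would yield "unknown". A sentence-initial capital would also be picked up as a candidate, which a real system would need to handle with additional rules.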


French proper names, Incremental recognition, Learning processing, Referential composition, Corpus investigation

Full Paper