The Automatic Content Extraction (ACE) Program - Tasks, Data, and Evaluation


George Doddington (1), Alexis Mitchell (2), Mark Przybocki (1), Lance Ramshaw (3), Stephanie Strassel (2), Ralph Weischedel (3)

(1) NIST - Gaithersburg, MD; (2) LDC - Philadelphia, PA; (3) BBN - Cambridge, MA




The objective of the ACE program is to develop technology to automatically infer from human language data the entities being mentioned, the relations among these entities that are directly expressed, and the events in which these entities participate. Data sources include audio and image data in addition to text, and the languages covered include Arabic and Chinese in addition to English. The effort involves defining the research tasks in detail; collecting and annotating the data needed for training, development, and evaluation; and supporting the research with evaluation tools and research workshops. The program began with a pilot study in 1999. The next evaluation is scheduled for September 2004.


extraction, semantics, analysis, corpora, annotation, evaluation, entities, relations, events

Language(s) Arabic, Chinese, English