A Registry of Standard Data Categories for Linguistic Annotation
Nancy Ide (1), Laurent Romary (2)
(1) Department of Computer Science, Vassar College, Poughkeepsie, NY 12604-0520, USA; (2) Equipe Langue et Dialogue, LORIA/INRIA, Vandoeuvre-lès-Nancy, France
In this paper we describe the most recent work within ISO TC37/SC 4, in particular the development of a Data Category Registry (DCR) as a component of the Linguistic Annotation Framework. The DCR will contain a formally defined set of linguistic categories in common use within the language engineering community, for reference and use in linguistically annotated resources. We outline the first proposals for the creation and management of the DCR in order to solicit input from the community.
standards, linguistic annotation