Summary of the paper

Title Annotation Process Management Revisited
Authors Dain Kaplan, Ryu Iida and Takenobu Tokunaga
Abstract Proper annotation process management is crucial to the construction of corpora, which are in turn indispensable to the data-driven techniques that have come to the forefront in NLP during the last two decades. It is still common to see ad-hoc tools created for a specific annotation project, but it is time this changed; creating such tools is labor- and time-intensive, and is secondary to corpus creation. In addition, such tools likely lack proper annotation process management, which is increasingly important as corpora grow in size and complexity. This paper first raises a list of ten needs that any general-purpose annotation system should address moving forward, such as user & role management, delegation & monitoring of work, diffing & merging annotators’ work, versioning of corpora, multilingual support, import/export format flexibility, and so on. A framework to address these needs is then proposed, and the benefits of proper annotation process management for the creation and maintenance of corpora are explained. The paper then introduces SLATE (Segment and Link-based Annotation Tool Enhanced), the second iteration of a web-based annotation tool, which is being rewritten to implement the proposed framework.
Topics Tools, systems, applications, Corpus (creation, annotation, etc.), LR Infrastructures and Architectures
Full paper Annotation Process Management Revisited
Slides -
Bibtex @InProceedings{KAPLAN10.129,
  author = {Dain Kaplan and Ryu Iida and Takenobu Tokunaga},
  title = {Annotation Process Management Revisited},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
}