Summary of the paper

Title Extending Standoff Annotation
Authors Maik Stührenberg
Abstract Textual information is sometimes accompanied by additional encodings (such as visuals). These multimodal documents may be interesting objects of investigation for linguistics. Another class of complex documents is pre-annotated documents. Classic XML inline annotation often fails for both document classes because of overlapping markup. However, standoff annotation, that is, the separation of primary data and markup, is a valuable and common mechanism to annotate multiple hierarchies and/or read-only primary data. We demonstrate an extended version of the XStandoff meta markup language that allows the definition of segments in spatial and pre-annotated primary data. Together with the ability to import already established (linguistic) serialization formats as annotation levels and layers in an XStandoff instance, we are able to annotate a variety of primary data files, including text, audio, still and moving images. Application scenarios that may benefit from using XStandoff are the analysis of multimodal documents such as instruction manuals, sports match analysis, or the less destructive cleaning of web pages.
Topics Multimedia Document Processing, Tools, Systems, Applications
Full paper Extending Standoff Annotation
Bibtex @InProceedings{STHRENBERG14.308,
  author = {Maik Stührenberg},
  title = {Extending Standoff Annotation},
  booktitle = {Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)},
  year = {2014},
  month = {may},
  date = {26-31},
  address = {Reykjavik, Iceland},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Hrafn Loftsson and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {978-2-9517408-8-4},
  language = {english}
}