Summary of the paper

Title: Exploiting Multiply Annotated Corpora in Biomedical Information Extraction Tasks
Authors: Barry Haddow and Beatrice Alex
Abstract: This paper discusses the problem of utilising multiply annotated data in training biomedical information extraction systems. Two corpora, annotated with entities and relations, and containing a number of multiply annotated documents, are used to train named entity recognition and relation extraction systems. Several methods of automatically combining the multiple annotations to produce a single annotation are compared, but none produces better results than simply picking one of the annotated versions at random. It is also shown that adding extra singly annotated documents produces faster performance gains than adding extra multiply annotated documents.
Language: Single language
Topics: Information Extraction, Information Retrieval, Corpus (creation, annotation, etc.), Named Entity Recognition
BibTeX: @InProceedings{HADDOW08.516,
  author = {Barry Haddow and Beatrice Alex},
  title = {Exploiting Multiply Annotated Corpora in Biomedical Information Extraction Tasks},
  booktitle = {Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)},
  year = {2008},
  month = {may},
  date = {28-30},
  address = {Marrakech, Morocco},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-4-0},
  note = {http://www.lrec-conf.org/proceedings/lrec2008/},
  language = {english}
  }