Summary of the paper

Title Is this NE tagger getting old?
Authors Cristina Mota and Ralph Grishman
Abstract This paper focuses on the influence of changing the text time frame on the performance of a named entity tagger. We followed a twofold approach to investigate this subject: on the one hand, we analyzed a corpus that spans 8 years, and, on the other hand, we assessed the performance of a name tagger trained and tested on that corpus. We created 8 samples from the corpus, each drawn from the articles for a particular year. In terms of corpus analysis, we calculated the corpus similarity and names shared between samples. To see the effect on tagger performance, we implemented a semi-supervised name tagger based on co-training; then, we trained and tested our tagger on those samples. We observed that corpus similarity, names shared between samples, and tagger performance all decay as the time gap between the samples increases. Furthermore, we observed that the corpus similarity and names shared correlate with the tagger F-measure. These results show that named entity recognition systems may become obsolete in a short period of time.
Language Language-independent
Topics Named Entity recognition, Corpus (creation, annotation, etc.), Acquisition, Machine Learning
Bibtex @InProceedings{MOTA08.303,
  author = {Cristina Mota and Ralph Grishman},
  title = {Is this NE tagger getting old?},
  booktitle = {Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)},
  year = {2008},
  month = {may},
  date = {28-30},
  address = {Marrakech, Morocco},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-4-0},
  note = {},
  language = {english}
}

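The abstract's "names shared between samples" measure, and its decay as the time gap grows, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `shared_name_ratio` and `pearson`, the definition of sharing as the fraction of distinct names in one sample that also occur in another, and the toy samples, which are placeholders and not data from the paper. The paper's actual measures may be defined differently.

```python
from itertools import combinations
from math import sqrt


def shared_name_ratio(names_a, names_b):
    """Fraction of distinct names in sample A that also occur in sample B.

    Hypothetical definition for illustration; not taken from the paper.
    """
    a, b = set(names_a), set(names_b)
    return len(a & b) / len(a) if a else 0.0


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Toy placeholder samples keyed by a year index (NOT data from the paper).
# Each later sample replaces one more name, mimicking vocabulary drift.
samples = {
    1: ["name_a", "name_b", "name_c", "name_d"],
    2: ["name_a", "name_e", "name_c", "name_d"],
    3: ["name_e", "name_f", "name_c", "name_d"],
    4: ["name_e", "name_f", "name_g", "name_d"],
}

# Compare every pair of samples and record the time gap between them.
gaps, ratios = [], []
for earlier, later in combinations(sorted(samples), 2):
    gaps.append(later - earlier)
    ratios.append(shared_name_ratio(samples[earlier], samples[later]))

# A negative coefficient means that sharing drops as the gap widens.
print(pearson(gaps, ratios))
```

The same `pearson` helper could be applied to pairs of (shared-name ratio, F-measure) values to check the kind of correlation the abstract reports, given real tagger scores for each train/test sample pair.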