Summary of the paper

Title Corpus and Evaluation of Handwriting Recognition of Historical Genealogical Records
Authors Patrick Schone, Heath Nielson and Mark Ward
Abstract Over the last few decades, significant strides have been made in handwriting recognition (HR), which is the automatic transcription of handwritten documents. HR often focuses on modern handwritten material, but in the electronic age, the volume of handwritten material is rapidly declining. However, we believe HR is on the verge of having major application to historical record collections. In recent years, archives and genealogical organizations have conducted huge campaigns to transcribe valuable historical record content with such transcription being largely done through human-intensive labor. HR has the potential of revolutionizing these transcription endeavors. To test the hypothesis that this technology is close to applicability, and to provide a testbed for reducing any accuracy gaps, we have developed an evaluation paradigm for historical record handwriting recognition. We created a huge test corpus consisting of four historical data collections of four differing genres and three languages. In this paper, we provide the details of these extensive resources which we intend to release to the research community for further study. Since several research organizations have already participated in this evaluation, we also show initial results and comparisons to human levels of performance.
Topics Corpus (Creation, Annotation, etc.), Evaluation Methodologies
Full paper Corpus and Evaluation of Handwriting Recognition of Historical Genealogical Records
Bibtex @InProceedings{SCHONE14.253,
  author = {Patrick Schone and Heath Nielson and Mark Ward},
  title = {Corpus and Evaluation of Handwriting Recognition of Historical Genealogical Records},
  booktitle = {Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)},
  year = {2014},
  month = {may},
  date = {26-31},
  address = {Reykjavik, Iceland},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Hrafn Loftsson and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {978-2-9517408-8-4},
  language = {english}
Powered by ELDA © 2014 ELDA/ELRA