An XML Representation for Annotated Handwriting Datasets for Online Handwriting Recognition


Ajay S Bhaskarabhatla, Sriganesh Madhvanath

Hewlett-Packard Labs, India




In this paper, we briefly describe an XML representation for annotating online handwriting data to support the development and evaluation of handwriting recognition algorithms. The representation is based on the emerging Digital Ink Markup Language (InkML) draft standard from the W3C. In particular, we describe how the representation we have defined attempts to address (i) support for different scripts, (ii) partial automation of labeling using recognition engines, (iii) planned as well as casual capture of handwriting data, and (iv) semantic annotation of handwriting data at various levels, such as character, word, and phrase. The representation keeps the raw handwriting data (described by InkML) separate from its semantic interpretations. We also compare and contrast the XML representation with the extant UNIPEN representation for annotation of handwriting data.


Keywords: annotation, online handwriting, InkML, datasets

Language(s): Indic scripts
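
The separation of raw ink from its semantic interpretations, as described in the abstract, can be illustrated with a minimal sketch. Only the InkML `ink` and `trace` elements and the InkML namespace come from the W3C specification; the `annotations` and `label` elements, their `level`, `truth` and `traceRefs` attributes, and the function names are hypothetical and are not the schema defined in this paper.

```python
import xml.etree.ElementTree as ET

INKML_NS = "http://www.w3.org/2003/InkML"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Raw ink: InkML traces only, carrying no labels.
INK_DOC = f"""<ink xmlns="{INKML_NS}">
  <trace xml:id="t1">10 0, 9 14, 8 28</trace>
  <trace xml:id="t2">20 0, 21 15</trace>
  <trace xml:id="t3">30 5, 31 20, 33 30</trace>
</ink>"""

# Hypothetical annotation document (NOT the paper's schema): labels at
# several levels refer back to traces by id instead of embedding them.
ANNOTATION_DOC = """<annotations>
  <label level="word" truth="an" traceRefs="t1 t2 t3">
    <label level="character" truth="a" traceRefs="t1 t2"/>
    <label level="character" truth="n" traceRefs="t3"/>
  </label>
</annotations>"""


def parse_traces(ink_xml):
    """Map each trace id to its list of (x, y) points."""
    root = ET.fromstring(ink_xml)
    traces = {}
    for trace in root.iter(f"{{{INKML_NS}}}trace"):
        points = [tuple(float(v) for v in pt.split())
                  for pt in trace.text.split(",")]
        traces[trace.get(XML_ID)] = points
    return traces


def parse_labels(annotation_xml):
    """Return (level, truth, trace ids) for every label, in document order."""
    root = ET.fromstring(annotation_xml)
    return [(el.get("level"), el.get("truth"), el.get("traceRefs").split())
            for el in root.iter("label")]


traces = parse_traces(INK_DOC)
labels = parse_labels(ANNOTATION_DOC)

# Resolve each label to the raw points it covers.
for level, truth, refs in labels:
    n_points = sum(len(traces[r]) for r in refs)
    print(f"{level:9s} {truth!r}: traces {refs}, {n_points} points")
```

Because the labels only reference trace ids, the same ink file could in principle carry several independent annotation documents, for example one produced by a recognition engine and one corrected by hand, without the raw data being touched.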
