Querying both time-aligned and hierarchical corpora with NXT Search


Ulrich Heid (1), Holger Voormann (1), Jan-Torsten Milde (2), Ulrike Gut (3), Katrin Erk (4), Sebastian Padó (4)

(1) Institut für maschinelle Sprachverarbeitung, University of Stuttgart, Azenbergstr. 12, 70174 Stuttgart, Germany, {heid, voormann}@ims.uni-stuttgart.de; (2) Fachhochschule Fulda, Marquardstraße 35, 36039 Fulda, Jan-Torsten.Milde@fh-fulda.de; (3) Englisches Seminar, Albert-Ludwigs-Universität Freiburg i.Br., Rempartstr. 15, 79098 Freiburg, Germany, ulrike.gut@anglistik.uni-freiburg.de; (4) Computerlinguistik, Universität des Saarlandes, Im Stadtwald 17, 66123 Saarbrücken, Germany, {erk, pado}@coli.uni-sb.de




One obstacle to the (re-)usability and exchange of annotated corpora is the lack of standards for corpus formats and corpus query tools. This paper reports on the NXT Search tool, which was used to query two corpora with very different annotation formats. It is shown that, with automatic data format conversion, both corpora can be accessed and searched with NXT Search.


corpus query, XML corpora, cross-level observations, corpus tools, querying

Language(s) German, English
