Title

The Integrated Language Database of 8th-21st-Century Dutch

Author(s)

J. G. Kruyt

Institute for Dutch Lexicology INL, NL-2300 RA Leiden, The Netherlands

Session

P19-SW

Abstract

The Institute for Dutch Lexicology (INL) has a long-standing tradition in corpus-based lexicography. The results include electronic scholarly dictionaries of Dutch covering the vocabulary from 1200 up to 1976, linguistically annotated electronic text corpora of historical and present-day Dutch, and computational lexica. An ongoing long-term INL project, the Integrated Language Database of 8th-21st-Century Dutch (ILD), adds value to these data. The aim is to create a flexible linguistic research instrument by linking the dictionaries, a balanced diachronic text corpus, and lexica of historical and present-day Dutch. We will also link part of our data with data collections stored at other institutes, creating a supra-institutional research instrument. The paper reports on the overall ILD design and the user's perspective. The focus is on the ILD prototype, which, when finished, will serve as a demonstration model to verify and assess user needs. At present it serves to test the design empirically for its applicability to 'real data' and to obtain figures on workload, etc. We conclude that this second function has proved the prototype to be an indispensable pilot for the ILD.

Keyword(s)

diachronic language database, various linguistic datatypes, TEI encoding, PoS and headword encoding, prototype, user's perspective

Language(s)

present-day and historical Dutch

Full Paper

88.pdf