LREC 2000 2nd International Conference on Language Resources & Evaluation

Title A Software Toolkit for Sharing and Accessing Corpora Over the Internet
Authors Luz Saturnino (Natural Interactive Systems Laboratory, University of Southern Denmark, Forskerparken 10, 5000 Odense, Denmark)
Keywords Corpus Processing, Distributed Corpora, Distributed Processing, Translation Studies, Translational English
Session Session WP8 - Corpus Tools
Full Paper 337.pdf
Abstract This paper describes the Translational English Corpus (TEC) and the software tools developed to enable remote use of the corpus over the internet. The model underlying these tools is based on an extensible client-server architecture implemented in Java. We discuss the data and processing constraints that motivated the TEC architecture design, and the impact of that design on the efficiency and scalability of the system. We also suggest that the kind of distributed processing model adopted in TEC could help make corpus linguistic resources more widely available to the research community.