LREC 2000 2nd International Conference on Language Resources & Evaluation  

Title Bootstrapping a Tagged Corpus through Combination of Existing Heterogeneous Taggers
Authors Zavrel Jakub (CNTS / Language Technology Group, University of Antwerp, Universiteitsplein 1, 2610 Wilrijk, Belgium, zavrel@uia.ua.ac.be)
Daelemans Walter (CNTS / Language Technology Group, University of Antwerp, Universiteitsplein 1, 2610 Wilrijk, Belgium, daelem@uia.ua.ac.be)
Keywords Combining Systems, Machine Learning, Reuse of Resources, Tagging
Session Session WO1 - Corpus Tagging
Abstract This paper describes a new method, COMBI-BOOTSTRAP, for exploiting existing taggers and lexical resources to annotate corpora with new tagsets. COMBI-BOOTSTRAP uses the output of existing resources as features for a second-level machine learning module, which is trained on a very small sample of annotated corpus material to map to the new tagset. Experiments show that COMBI-BOOTSTRAP: i) can integrate a wide variety of existing resources, and ii) achieves much higher accuracy (up to 44.7% error reduction) than both the best single tagger and an ensemble tagger constructed from the same small training sample.
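
The second-level idea in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names, the toy tags, and in particular the learner, which is a simple lookup table with back-off chosen as a stand-in and not necessarily the learner used in the paper. Each training row holds the tags that hypothetical existing taggers assigned to one token, and the target is the tag in the new tagset.

```python
from collections import Counter, defaultdict


def train_combiner(feature_rows, gold_tags):
    """Count how often each pattern of existing-tagger outputs maps to each new tag."""
    full = defaultdict(Counter)    # whole tuple of old tags -> new-tag counts
    single = defaultdict(Counter)  # (tagger index, old tag) -> new-tag counts
    overall = Counter()            # new-tag counts over the whole sample
    for row, gold in zip(feature_rows, gold_tags):
        full[tuple(row)][gold] += 1
        for i, old_tag in enumerate(row):
            single[(i, old_tag)][gold] += 1
        overall[gold] += 1
    return full, single, overall


def predict(model, row):
    """Map one token's existing-tagger outputs to a tag in the new tagset."""
    full, single, overall = model
    key = tuple(row)
    if key in full:
        # Exact pattern seen in the small annotated sample.
        return full[key].most_common(1)[0][0]
    # Back off: let each existing tagger's output vote separately.
    votes = Counter()
    for i, old_tag in enumerate(row):
        votes.update(single.get((i, old_tag), {}))
    if votes:
        return votes.most_common(1)[0][0]
    # Nothing matched: fall back to the most frequent new tag.
    return overall.most_common(1)[0][0]


# Toy sample: outputs of two hypothetical taggers with different tagsets,
# paired with hand-annotated tags in a hypothetical new tagset.
rows = [("NN", "noun"), ("VB", "verb"), ("NN", "noun"),
        ("JJ", "adj"), ("VB", "verb"), ("NN", "noun")]
gold = ["N", "V", "N", "ADJ", "V", "N"]
model = train_combiner(rows, gold)
print(predict(model, ("NN", "noun")))
print(predict(model, ("JJ", "noun")))
```

The existing taggers are never retrained in this sketch; only the small combiner sees the new tagset, which is why a very small annotated sample can be enough to get started.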

 
