Bootstrapping Large Sense Tagged Corpora
Rada F. MIHALCEA (University of Texas at Dallas, Richardson, Texas, 75083-0688)
WO15: Semantic Tagging
The performance of Word Sense Disambiguation systems depends largely on the availability of sense-tagged corpora. Because semantic annotation is usually done by humans, such corpora are limited to a handful of tagged texts. This paper proposes a generation algorithm that may be used to automatically create large sense-tagged corpora. The approach is evaluated through comparative sense disambiguation experiments performed on data provided during the SENSEVAL-2 English all-words and English lexical sample tasks.