Opportunistic Semantic Tagging


Luisa Bentivogli (ITC-irst, Trento (Italy))

Emanuele Pianta (ITC-irst, Trento (Italy))


WO15: Semantic Tagging


Building semantically annotated corpora from scratch is a time-consuming activity that requires highly specialized resources. In this paper we present a pilot study carried out to test a methodology for creating a semantically annotated corpus by exploiting the information contained in an already annotated corpus. The main hypothesis underlying the proposed methodology is that, given a text and its translation into another language, the translation preserves to a large extent the meaning of the source text. This means that if one of the two texts is already semantically tagged, and if the parallel texts can be aligned at the appropriate level, it should be possible to transfer the semantic annotation from the tagged text to its translation. More specifically, our experiment considered word-level semantic annotation. The pilot study was carried out on six texts taken from the SemCor corpus and their Italian translations. To test the methodology we implemented an annotation transfer system based on an English/Italian word aligner, developed at ITC-irst, which relies mostly on information contained in bilingual dictionaries.
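The transfer idea can be illustrated with a minimal sketch. It is not the ITC-irst aligner or the system used in the pilot study. The function names, the toy bilingual dictionary, and the sense labels (such as `dog#n#1`) are all hypothetical, and the greedy dictionary lookup is a simplifying assumption that stands in for a real word aligner.

```python
from typing import Dict, List, Optional, Set, Tuple


def align_with_dictionary(
    source_tokens: List[str],
    target_tokens: List[str],
    bilingual_dict: Dict[str, Set[str]],
) -> List[Tuple[int, int]]:
    """Toy aligner: link each source token to the first unused target
    token that the (hypothetical) bilingual dictionary lists as a translation."""
    alignment: List[Tuple[int, int]] = []
    used: Set[int] = set()
    for i, src in enumerate(source_tokens):
        translations = bilingual_dict.get(src, set())
        for j, tgt in enumerate(target_tokens):
            if j not in used and tgt in translations:
                alignment.append((i, j))
                used.add(j)
                break
    return alignment


def transfer_annotations(
    source_senses: List[Optional[str]],
    target_length: int,
    alignment: List[Tuple[int, int]],
) -> List[Optional[str]]:
    """Copy each source token's sense tag to the target token aligned with it.
    Unaligned target tokens, and tokens aligned to untagged words, stay None."""
    target_senses: List[Optional[str]] = [None] * target_length
    for src_idx, tgt_idx in alignment:
        sense = source_senses[src_idx]
        if sense is not None:
            target_senses[tgt_idx] = sense
    return target_senses


# Toy data: sense labels and dictionary entries are illustrative only.
english = ["the", "dog", "sleeps"]
english_senses = [None, "dog#n#1", "sleep#v#1"]
italian = ["il", "cane", "dorme"]
toy_dictionary = {
    "the": {"il", "lo", "la"},
    "dog": {"cane"},
    "sleeps": {"dorme"},
}

alignment = align_with_dictionary(english, italian, toy_dictionary)
italian_senses = transfer_annotations(english_senses, len(italian), alignment)
```

In this sketch the quality of the transferred annotation depends entirely on the alignment: a target word that is left unaligned receives no tag, and a wrong link transfers a wrong tag.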


MultiSemCor, Semantic tagging, Parallel corpus, Word alignment

