Title

Cost-effective cross-lingual document classification

Author(s)

Núria Bel (1), Cornelis H.A. Koster (2), Marta Villegas (3)

(1) IULA, Universitat Pompeu Fabra, (2) Computer Science Dept., University of Nijmegen, (3) Grup d'Investigació en Lingüística Computacional, Universitat de Barcelona

Session

P23-W

Abstract

This article addresses the question of how to deal with text categorization when the documents to be classified are written in different languages. The figures we provide demonstrate that cross-lingual classification, in which a classifier is trained on one language and tested against another, is feasible provided we translate a small number of words: the most relevant terms for class profiling. The experiments we report demonstrate that translating these most relevant words is a cost-effective approach to cross-lingual classification.

Keyword(s)

multilingual lexical resources, document classification, crosslingual document classification.

Language(s)

English, Spanish

Full Paper

201.pdf