Evaluation of a Vector Space Similarity Measure in a Multilingual Framework
Romaric Besançon (Artificial Intelligence Laboratory, Swiss Federal Institute of Technology, CH-1015 Lausanne, Switzerland)
Martin Rajman (Artificial Intelligence Laboratory, Swiss Federal Institute of Technology, CH-1015 Lausanne, Switzerland)
EO5: Lexical Evaluation
In this contribution, we propose a method that uses a multilingual framework to validate the relevance of the notion of vector-based semantic similarity between texts. The goal is to verify that vector-based semantic similarities can be reliably transferred from one language to another. More precisely, the idea is to test whether the relative positions of documents in the vector space associated with a given source language are close to those of their translations in the vector space associated with the target language. The experiments, carried out with both the standard Vector Space model and the more advanced DSIR model, have given very promising results.
Textual similarity, Vector space representation, Evaluation, Multilingual framework
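
The test described in the abstract can be illustrated with a minimal sketch. It assumes that "relative positions" are captured by pairwise cosine similarities between documents, and that closeness between the two languages is measured by the Pearson correlation of those similarities. Both choices are assumptions made here for illustration; the abstract does not state which comparison measure is actually used. The toy vectors and the function names (`pairwise_similarities`, `transfer_score`) are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is null)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def pairwise_similarities(vectors):
    """Similarities for all document pairs (i, j) with i < j, in a fixed order."""
    n = len(vectors)
    return [cosine(vectors[i], vectors[j])
            for i in range(n) for j in range(i + 1, n)]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def transfer_score(source_vectors, target_vectors):
    """Compare relative document positions across two languages.

    source_vectors[k] and target_vectors[k] must represent the same document
    and its translation. The two spaces may have different dimensions, since
    only similarities computed within each space are compared.
    """
    if len(source_vectors) != len(target_vectors):
        raise ValueError("each document needs exactly one translation")
    return pearson(pairwise_similarities(source_vectors),
                   pairwise_similarities(target_vectors))


# Toy data (hypothetical): three documents, with a 4-term source vocabulary
# and a 3-term target vocabulary.
source_docs = [[2, 1, 0, 0], [1, 2, 0, 0], [0, 0, 3, 1]]
target_docs = [[1, 0, 2], [1, 0, 1], [0, 4, 0]]

score = transfer_score(source_docs, target_docs)
print(f"correlation of pairwise similarities: {score:.3f}")
```

A score near 1 would indicate that documents that are close in the source space have translations that are close in the target space. A rank-based correlation could be substituted for `pearson` if only the ordering of similarities is of interest.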