Evaluation Resources for Concept-based Cross-Lingual Information Retrieval in the Medical Domain


Paul Buitelaar (1), Diana Steffen (1), Martin Volk (2), Dominic Widdows (3), Bogdan Sacaleanu (1), Špela Vintar (1), Stanley Peters (3), Hans Uszkoreit (1)

(1) DFKI GmbH, Stuhlsatzenhausweg 3, 66123 Saarbrücken, Germany, {paulb, bogdan, uszkoreit}@dfki.de; (2) previously at Eurospider Information Technology AG, Schaffhauserstrasse 18, CH-8006 Zürich, Switzerland, volk@ling.su.se; (3) Stanford University, CSLI, 220 Panama Street, Stanford, CA 94305-4115, USA, {dwiddows, peters}@csli.stanford.edu




This paper describes evaluation resources for concept-based cross-lingual information retrieval in the medical domain. All resources were constructed in the context of the MuchMore project and are freely available through the project website. The resources include a bilingual parallel document collection of German and English medical scientific abstracts, a set of queries with corresponding relevance assessments, two manually disambiguated test sets for semantic annotation (sense disambiguation), and two evaluation lists for German morphological decomposition of medical terms.


Cross-Lingual Information Retrieval, Evaluation, Semantic Annotation

Language(s) German (Deutsch), English