Exploring Balkanet Shared Ontology for Multilingual Conceptual Indexing


Sofia Stamou (1), Goran Nenadic (2), Dimitris Christodoulakis (1)

(1) Computer Engineering and Informatics Department, Patras University, 26500, and Research Academic Computer Technology Institute, 61 Riga Feraiou, 26221, Patras, Greece, {stamou, dxri}@cti.gr; (2) Department of Computation, UMIST, Manchester, UK, G.Nenadic@umist.ac.uk




As the size of the Web grows, it becomes imperative to equip search engines with sophisticated indexing modules that enable a meaningful organization of the stored data. In this paper we present a structured multilingual conceptual repository that has been employed as the backbone of a conceptual indexing and retrieval system. Our conceptual warehouse originates from a multilingual semantic network (Balkanet) and its Inter-Lingual-Index (ILI), which was enriched with domain ontology information inherited from the SUMO ontology. We report on the ontology's design principles and describe its structure. We argue that an important attribute of Balkanet's ILI is its flexibility in incorporating new concepts and/or languages, achieved by allowing shared semantic attributes to percolate to all concepts represented within the taxonomies. We further present our approach to conceptual indexing and introduce an indexing algorithm that utilizes Balkanet's classified conceptual taxonomies. Finally, we discuss how conceptual taxonomies can help retrieval algorithms link the terms used in search requests to semantically related terms that might be found in the indexed documents.


Structured cross-language sense inventory, conceptual taxonomy, Wordnet, content-based indexing, multilingual resources

Language(s): English and Balkan languages, namely Greek, Turkish, Serbian, Romanian, Bulgarian, Czech
Full Paper