LREC 2000 2nd International Conference on Language Resources & Evaluation

Title Production of NLP-oriented Bilingual Language Resources from Human-oriented dictionaries
Authors Fluhr-Semenova Vera (SCIPER, 46 rue du Moulin a Tan, 91150 Etampes, France)
Fluhr Christian (SCIPER, 46 rue du Moulin a Tan, 91150 Etampes, France)
Brisson Stéphanie (SCIPER, 46 rue du Moulin a Tan, 91150 Etampes, France)
Session Session WP4 - Lexicon: Semantic and Multilingual Issues
Full Paper, 328.pdf
Abstract This paper considers the main features of manually produced bilingual dictionaries that were originally designed for human use. The problem is to find a way to use such dictionaries to produce bilingual language resources that can serve as a basis for automated text processing, such as machine translation and cross-lingual querying in text retrieval. The proposed transformation technology is based on XML parsing of a file obtained from the source data by a series of special procedures. Automatic procedures suffice to produce a well-formed XML file. In most cases, however, semantic problems and inconveniences remain that can be eliminated only interactively. The volume of this interactive work can nevertheless be minimized through automatic pre-editing and suitable XML mark-up. The paper presents the results of an R&D project carried out in the framework of the ELRA 1999 Call for Proposals on Language Resources Production. It is based on the authors' experience with English-Russian and French-Russian dictionaries, but the technology can be applied to other language pairs.
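
The general idea of marking up a dictionary entry as XML and then checking well-formedness automatically can be illustrated with a minimal Python sketch. It is not the authors' procedure: the one-line source format ("headword: translation, translation; translation"), the element names `entry`, `hw`, `sense` and `tr`, and both function names are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET


def entry_to_xml(line: str) -> str:
    """Mark up one 'headword: t1, t2; t3' line (a hypothetical source format).

    Semicolons are assumed to separate senses and commas to separate
    translations within a sense.
    """
    headword, _, body = line.partition(":")
    entry = ET.Element("entry")
    ET.SubElement(entry, "hw").text = headword.strip()
    for sense_text in body.split(";"):
        translations = [t.strip() for t in sense_text.split(",") if t.strip()]
        if not translations:
            continue  # skip empty senses left by stray separators
        sense = ET.SubElement(entry, "sense")
        for translation in translations:
            ET.SubElement(sense, "tr").text = translation
    return ET.tostring(entry, encoding="unicode")


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


xml_entry = entry_to_xml("table: стол; таблица, расписание")
print(xml_entry)
print(is_well_formed(xml_entry))
```

A check of this kind only confirms that the mark-up is well formed. Whether the senses and translations were split correctly is a semantic question, which is the part the abstract says still needs interactive work.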