Summary of the paper

Title A Multi-Genre SMT System for Arabic to French
Authors Saša Hasan and Hermann Ney
Abstract This work presents improvements of a large-scale Arabic to French statistical machine translation system over a period of three years. The development includes better preprocessing, more training data, additional genre-specific tuning for different domains, namely newswire text and broadcast news transcripts, and improved domain-dependent language models. Starting with an early prototype in 2005 that participated in the second CESTA evaluation, the system was further upgraded to achieve favorable BLEU scores of 44.8% for the text and 41.1% for the audio setting. These results are compared to a system based on the freely available Moses toolkit. We show significant gains both in terms of translation quality (up to +1.2% BLEU absolute) and translation speed (up to 16 times faster) for comparable configuration settings.
Language Multiple languages
Topics Machine Translation, SpeechToSpeech Translation, Corpus (creation, annotation, etc.), Tools, systems, applications
Full paper A Multi-Genre SMT System for Arabic to French
Slides A Multi-Genre SMT System for Arabic to French
Bibtex @InProceedings{HASAN08.549,
  author = {Saša Hasan and Hermann Ney},
  title = {A Multi-Genre SMT System for Arabic to French},
  booktitle = {Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)},
  year = {2008},
  month = {may},
  date = {28-30},
  address = {Marrakech, Morocco},
  editor = {Nicoletta Calzolari (Conference Chair), Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-4-0},
  note = {http://www.lrec-conf.org/proceedings/lrec2008/},
  language = {english}
  }
