Title

Statistical Machine Translation on Paraphrased Corpora

Authors

Taro Watanabe (ATR Spoken Language Translation Laboratories, 2-2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0288 JAPAN)

Mitsuo Shimohata (ATR Spoken Language Translation Laboratories, 2-2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0288 JAPAN)

Eiichiro Sumita (ATR Spoken Language Translation Laboratories, 2-2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0288 JAPAN)

Session

WO20: Machine Translation

Abstract

This paper presents a statistical machine translation system trained on normalized corpora. Paraphrasing is carried out automatically by inducing paraphrase expressions from a bilingual corpus. Normalization is then treated as the selection of one specific paraphrase of a given input, determined by its frequency in a corpus. Experiments on Japanese-to-English translation with a normalized English corpus showed a reduction in word error rate of 8% and an improvement in subjective evaluation from 70% to 72.5%.
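
The frequency-based normalization described in the abstract can be pictured with a minimal sketch. It is an illustration under stated assumptions, not the authors' implementation: the function name `normalize`, the toy corpus, and the hand-written paraphrase set are all hypothetical, and the paraphrase sets are assumed to be given already, whereas the paper induces them from a bilingual corpus.

```python
from collections import Counter

def normalize(sentence, paraphrase_sets, corpus_counts):
    """Map a sentence to the most frequent member of its paraphrase set.

    Hypothetical helper: sentences outside every set are returned unchanged.
    """
    for group in paraphrase_sets:
        if sentence in group:
            # Sorting first makes ties resolve alphabetically (deterministic).
            return max(sorted(group), key=lambda s: corpus_counts[s])
    return sentence

# Toy monolingual corpus (assumed data, for illustration only).
corpus = [
    "could you help me",
    "could you help me",
    "can you help me",
    "would you mind helping me",
    "thank you",
]
corpus_counts = Counter(corpus)

# One hand-written paraphrase set standing in for the induced ones.
paraphrase_sets = [
    {"could you help me", "can you help me", "would you mind helping me"},
]

normalized_corpus = [
    normalize(s, paraphrase_sets, corpus_counts) for s in corpus
]
print(normalized_corpus)
```

In this toy setting every member of the paraphrase set collapses to its most frequent variant, which reduces the variety of expressions a translation model would have to learn from the corpus.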

Keywords

Machine translation

Full Paper

134.pdf