LREC 2012 Proceedings

Summary of the paper

Title	Automatic MT Error Analysis: Hjerson Helping Addicter
Authors	Jan Berka, Ondřej Bojar, Mark Fishel, Maja Popović and Daniel Zeman
Abstract	We present a complex, open-source tool for detailed machine translation error analysis. It provides the user with automatic error detection and classification, several monolingual alignment algorithms, and training and test corpus browsing. The tool is the result of merging the automatic error detection and classification of Hjerson (Popović, 2011) and Addicter (Zeman et al., 2011) into the pipeline and web visualization of Addicter. It classifies errors into categories similar to those of Vilar et al. (2006): morphological, reordering, missing words, extra words and lexical errors. The graphical user interface shows alignments in both the training corpus and the test data, with the different classes of errors colored. A summary of errors can also be displayed to provide an overall view of the MT system's weaknesses. The tool was developed on Linux, but it has also been tested on Windows.
Topics	Evaluation methodologies, Machine Translation, Speech-to-Speech Translation, Tools, systems, applications
Full paper	Automatic MT Error Analysis: Hjerson Helping Addicter
Bibtex	@InProceedings{BERKA12.336, author = {Jan Berka and Ondřej Bojar and Mark Fishel and Maja Popović and Daniel Zeman}, title = {Automatic MT Error Analysis: Hjerson Helping Addicter}, booktitle = {Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)}, year = {2012}, month = {may}, date = {23-25}, address = {Istanbul, Turkey}, editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Mehmet Uğur Doğan and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis}, publisher = {European Language Resources Association (ELRA)}, isbn = {978-2-9517408-7-7}, language = {english} }