Title

Computational Linguistics at Universiti Sains Malaysia

Authors

Choy-Kim Chuah (School of Computer Science, Universiti Sains Malaysia)

Zaharin Yusoff (School of Computer Science, Universiti Sains Malaysia)

Session

WP6: Lrs & Projects

Abstract

This paper gives a brief history of UTMK, a computer-aided translation unit, and reports on its projects and research collaborations. After its beginnings as a thesis project on Malay affixation, UTMK's interest moved from machine translation to the development of tools for translation. Today, UTMK's focus is on the development of natural language processing applications and tools (internet browsers, and corpus and dictionary databases). Continuing its policy of research collaboration, UTMK is leading a three-country project to pool computing and linguistic resources and expertise on Malay. For historical reasons, bahasa Indonesia and bahasa Melayu, the varieties of Malay used in Indonesia and Malaysia respectively, have diverged in vocabulary, pronunciation and spelling. To support effective communication, a council was set up in 1972 to standardise the spelling and terminology used in the two countries. Brunei joined this council in 1985. To encourage studies on Malay, texts need to be available; however, resources in digital form are scarce. At a recent meeting, the council proposed setting up a Malay language portal to make linguistic resources from the three countries available online, and also to popularise Malay as a South-East Asian language. Non-member countries are welcome to participate in the portal.

Keywords

Multilingual, Databases, Corpus, Malay, Dictionaries

Full Paper

8.pdf