Computational Linguistics at Universiti Sains Malaysia


Choy-Kim Chuah (School of Computer Science, Universiti Sains Malaysia)

Zaharin Yusoff (School of Computer Science, Universiti Sains Malaysia)


WP6: LRs & Projects


This paper gives a brief history of UTMK, a computer-aided translation unit, and reports on its projects and research collaborations. After beginning as a thesis project on Malay affixation, UTMK shifted its interest from machine translation to the development of tools for translation. Today, UTMK focuses on developing natural language processing applications and tools (internet browsers, and corpus and dictionary databases). Continuing its policy of research collaboration, UTMK is also leading a three-country project to pool computing and linguistic resources and expertise on Malay. For historical reasons, bahasa Indonesia and bahasa Melayu, the varieties of Malay used in Indonesia and Malaysia respectively, have diverged in vocabulary, pronunciation and spelling. To support effective communication, a council was set up in 1972 to standardize the spelling and terminology used in the two countries; Brunei joined this council in 1985. To encourage studies on Malay, texts need to be available, but resources in digital form are scarce. At a recent meeting, the council proposed setting up a Malay language portal to make linguistic resources from the three countries available online and to popularise Malay as a South-East Asian language. Non-member countries are welcome to participate in the portal.


Multilingual, Databases, Corpus, Malay, Dictionaries
