Automatic Morphological Segmentation for Continuous Speech Recognition of Basque


K. López de Ipiña (Sistemen Ingeniaritza eta Automatika Saila, Gasteiz. University of the Basque Country.)

N. Ezeiza (IXA Taldea. University of the Basque Country.)

G. Bordel (Elektrika eta Elektronika Saila, Bilbo. University of the Basque Country.)


SP3 Annotation Tools: From Speech Segments To Dialogues


The selection of appropriate Lexical Units (LUs) is an important issue in the development of Continuous Speech Recognition (CSR) systems. The word has classically been used as the unit in most such systems, but proposals for non-word units have begun to emerge. Because Basque, the subject of this study, is an agglutinative language with complex word-internal structure, non-word units could be an appropriate choice. This work presents an automatic morphological segmentation tool oriented to CSR tasks.


Automatic morphological segmentation, Continuous speech recognition, Morphological analysis, Basque language, Language modelling
