Estonian speech in the BABEL multilanguage database: phonetic-phonological problems revealed in the text corpus

Arvo Eek, Einar Meister

In the present paper we discuss the main problems of Estonian phonology: the determination of segmental phonemes, the interrelations between phonemes and quantity degrees, and the relation between word stress and quantity degrees. Starting from the EUROM_1 corpus design, we modified the number of speech blocks for Estonian (55 blocks in total) to guarantee that all the main phonologically relevant oppositions are represented in the text corpus. Isolated CVC constructions are used to present word-initial and word-final consonants, as well as interconsonantal short and long vowels. The CVC material defines all possible CV- and VC-diphones on the basis of single and geminate consonants in combination with short and long vowels. It does not, however, define the same set of diphones in different word-accent contexts. The main types of word-accent patterns are presented in filler sentences and in passages. The text corpus also contains diphthongs and consonant clusters. In addition, the passages and filler sentences serve as the main material for characterising intonational, phrasal and foot structures.
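
The combinatorics behind the CVC material can be illustrated with a minimal sketch. It is not the authors' procedure: the consonant and vowel inventories below are small hypothetical placeholders rather than the Estonian phoneme set, doubling a symbol to mark a geminate consonant or a long vowel is a notational assumption, and phonotactic restrictions are ignored.

```python
from itertools import product

# Hypothetical, deliberately small inventories (NOT the Estonian phoneme set).
CONSONANTS = ["p", "t", "k", "s", "m", "l"]
VOWELS = ["a", "i", "u"]


def with_length(segments):
    """Return every segment in two variants: plain and doubled.

    Doubling stands for a geminate consonant or a long vowel
    (a notational assumption made for this sketch only).
    """
    return [seg * n for seg in segments for n in (1, 2)]


def diphones(consonants, vowels):
    """Enumerate CV and VC diphones over single/geminate x short/long."""
    c = with_length(consonants)
    v = with_length(vowels)
    cv = [f"{a}-{b}" for a, b in product(c, v)]
    vc = [f"{a}-{b}" for a, b in product(v, c)]
    return cv, vc


cv, vc = diphones(CONSONANTS, VOWELS)
print(len(cv), len(vc))
```

With the placeholder inventories each list has (6 x 2) x (3 x 2) = 72 entries. The count grows with the product of the two inventory sizes, which suggests why covering the same diphones again in every word-accent context would enlarge the corpus considerably.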

