Finnish and Estonian Speech Applications Developed on an Object-Oriented Speech Processing and Database System

Toomas Altosaar, Matti Karjalainen, Martti Vainio, Einar Meister

Full utilisation of the information available in speech databases has not always been feasible because the databases employ differing standards and formats. In addition, the diversity introduced by multiple languages has made it even more difficult to analyse speech databases within a single computing environment.

In this paper we briefly present the QuickSig object-oriented signal processing system [1], a modern tool for DSP-related studies. It lets speech scientists work in a flexible and motivating environment where signals, filters, spectrograms, etc., are all modelled as objects. Seamlessly integrated with QuickSig is an object-oriented database [2] that stores signals, along with their features and relations, persistently between sessions in a manner that is transparent to the user. A multilingual phonetic representational system [3] exists within the same environment and allows speech from different databases (e.g., different languages and phonetic alphabets) to be modelled generically. Relations between speech units such as sentences, words, phones, etc., are defined explicitly, forming a phonetic object structure for each utterance. The user can easily formulate complex pattern-matching searches that traverse these phonetic structures and return the desired contexts. The speech events found in this way can then be used in actual applications.
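
As an illustration only, the idea of a phonetic object structure with a context search can be sketched in Python as follows. This is not QuickSig code and does not reflect its actual interface: the class `Unit`, the functions `find_contexts`, `is_vowel` and `build_example`, the vowel set, and the timing values are all hypothetical and were chosen just to make the sketch self-contained.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Optional


@dataclass(eq=False)
class Unit:
    """Hypothetical speech unit (sentence, word or phone) with a time span in seconds."""
    kind: str
    label: str
    start: float
    end: float
    parent: Optional["Unit"] = field(default=None, repr=False)
    children: List["Unit"] = field(default_factory=list, repr=False)

    def add(self, child: "Unit") -> "Unit":
        """Attach a child unit and record the explicit parent relation."""
        child.parent = self
        self.children.append(child)
        return child

    def descendants(self, kind: str) -> Iterator["Unit"]:
        """Yield all units of the given kind below this one, in temporal order."""
        for child in self.children:
            if child.kind == kind:
                yield child
            yield from child.descendants(kind)


Predicate = Callable[[Unit], bool]


def find_contexts(root: Unit, pattern: List[Predicate]) -> List[List[Unit]]:
    """Return every run of consecutive phones in which phone i satisfies pattern[i]."""
    phones = list(root.descendants("phone"))
    n = len(pattern)
    return [
        phones[i:i + n]
        for i in range(len(phones) - n + 1)
        if all(test(unit) for test, unit in zip(pattern, phones[i:i + n]))
    ]


# Assumed vowel inventory, using orthographic symbols as stand-ins for phones.
VOWELS = set("aeiouyäö")


def is_vowel(unit: Unit) -> bool:
    return unit.label in VOWELS


def build_example() -> Unit:
    """Build a toy structure for the utterance 'talo on' with made-up 0.1 s phones."""
    sentence = Unit("sentence", "talo on", 0.0, 0.6)
    t = 0.0
    for text in ["talo", "on"]:
        word = sentence.add(Unit("word", text, t, t + 0.1 * len(text)))
        for symbol in text:
            word.add(Unit("phone", symbol, t, t + 0.1))
            t += 0.1
    return sentence


if __name__ == "__main__":
    utterance = build_example()
    # Search for two adjacent vowels, regardless of word boundaries.
    for match in find_contexts(utterance, [is_vowel, is_vowel]):
        print([(p.label, p.parent.label) for p in match])
```

Because each phone keeps an explicit link to its parent word, a match returned by the search can be traced back to the word and sentence it came from, which is the kind of context a synthesis or recognition application would need.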

The remainder of the paper presents some of the applications that have been developed on this platform using Finnish and Estonian databases as the source speech material. These include speech synthesis [4,5], speech recognition [6], and speaker verification/identification [7].


