See Instructions for authors.
This workshop, which is held in conjunction with the First International Conference on Language Resources and Evaluation in Granada, Spain, will be concerned with the design, production and transcription standards required for the construction of speech databases for languages of Central and Eastern Europe.
Speech databases have been produced for a number of the world's major languages, but most languages of Central and Eastern Europe have received little attention in international terms until recently, though they are of major importance for the future of European speech science. There are special issues which arise in the production of representative samples of these languages, and this workshop will attempt to address these issues. The BABEL project (funded by the European Union under the COPERNICUS programme, project #1304) has been working on these issues since 1995, and will soon complete a database of Bulgarian, Estonian, Hungarian, Polish and Romanian. The work of the project will be reported at the workshop, and aspects of the project will be the subject of practical demonstrations, but it is hoped that papers will be contributed by other interested researchers who are not associated with the project.
Information about BABEL can be read on its WWW pages.
Information about the main conference can be read on its WWW pages.
|Programme||Time||Author(s)||Title & link to abstract|
|Paper 1||14:40||Arvo Eek, Einar Meister||Estonian speech in the BABEL multilanguage database: phonetic-phonological problems revealed in the text corpus|
|Paper 2||15:00||Slawomir Kula||Telephone bandwidth speech database: creation, applications and experiences for Polish language|
|Paper 3||15:20||Henk van den Heuvel, Valery Galounov, Herbert S. Tropf||The SPEECHDAT(E) project: Creating speech databases for eastern European languages|
|Open Forum 1||15:40||||The nature of our data|
|Paper 4||16:00||Klara Vicsi, A. Vig, G. Gordos||Experience on the development of a language independent automatic segmentation and labeling system on the frame of the BABEL project|
|Paper 5||16:20||Simon Dobrisek, Jerneja Gros, France Mihelic, Nikola Pavesic||GOPOLIS: A Multi Speaker Slovenian Speech Database|
|Paper 6||17:00||Toomas Altosaar, Matti Karjalainen, Martti Vainio, Einar Meister||Finnish and Estonian Speech Applications developed on an Object-Oriented Speech Processing and Database System|
|Open Forum 2||17:20||||Labelling and annotation|
|Paper 7||17:45||Marian Boldea, Cosmin Munteanu, Alin Doroga||Design, Collection, and Annotation of a Romanian Speech Database|
|Paper 8||18:05||Tamas Varadi||On the Spoken Corpus of the Budapest Sociolinguistic Interview|
|Paper 9||18:25||Zdravko Kacic, Janez Kaiser||Development of Slovenian SpeechDat database|
|Open Forum 3||18:45||||The Future|
Peter Roach, Department of Linguistic Science, University of Reading,
Reading RG6 6AA, UK.
Tel: (+44) 118 931 8138 Fax: (+44) 118 9753365
We hope that the following topics can be considered in the workshop; this list is not exclusive, however.
THE WORKSHOP WILL CONCLUDE WITH A DISCUSSION OF THE POSSIBILITY OF FORMING AN INFORMAL ASSOCIATION OF RESEARCHERS SPECIALISING IN THE SPOKEN FORMS OF CENTRAL AND EASTERN EUROPEAN LANGUAGES.